drivers/spi/spi-cadence-quadspi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
The recent refactoring of where runtime PM is enabled done in commit
f1eb4e792bb1 ("spi: spi-cadence-quadspi: Enable pm runtime earlier to
avoid imbalance") made the fact that when we do a pm_runtime_disable()
in the error paths of probe() we can trigger a runtime disable which in
turn results in duplicate clock disables. Early on in the probe function
we do a pm_runtime_get_noresume() since the probe function leaves the
device in a powered up state but in the error path we can't assume that PM
is enabled so we also manually disable everything, including clocks. This
means that when runtime PM is active both it and the probe function release
the same reference to the main clock for the IP, triggering warnings from
the clock subsystem:
[ 8.693719] clk:75:7 already disabled
[ 8.693791] WARNING: CPU: 1 PID: 185 at /usr/src/kernel/drivers/clk/clk.c:1188 clk_core_disable+0xa0/0xb
...
[ 8.694261] clk_core_disable+0xa0/0xb4 (P)
[ 8.694272] clk_disable+0x38/0x60
[ 8.694283] cqspi_probe+0x7c8/0xc5c [spi_cadence_quadspi]
[ 8.694309] platform_probe+0x5c/0xa4
Avoid this confused ownership by moving the pm_runtime_get_noresume() to
after the last point at which the probe() function can fail.
Reported-by: Francesco Dolcini <francesco@dolcini.it>
Closes: https://lore.kernel.org/r/20251201072844.GA6785@francesco-nb
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
---
drivers/spi/spi-cadence-quadspi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index af6d050da1c8..0833b6f666d0 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1985,7 +1985,6 @@ static int cqspi_probe(struct platform_device *pdev)
pm_runtime_enable(dev);
pm_runtime_set_autosuspend_delay(dev, CQSPI_AUTOSUSPEND_TIMEOUT);
pm_runtime_use_autosuspend(dev);
- pm_runtime_get_noresume(dev);
}
ret = cqspi_setup_flash(cqspi);
@@ -2012,6 +2011,7 @@ static int cqspi_probe(struct platform_device *pdev)
}
if (!(ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM))) {
+ pm_runtime_get_noresume(dev);
pm_runtime_mark_last_busy(dev);
pm_runtime_put_autosuspend(dev);
}
---
base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449
change-id: 20251202-spi-cadence-qspi-runtime-pm-imbalance-657740cf7eae
Best regards,
--
Mark Brown <broonie@kernel.org>
Hello Mark,
On Tue, Dec 02, 2025 at 10:53:44PM +0000, Mark Brown wrote:
> The recent refactoring of where runtime PM is enabled done in commit
> f1eb4e792bb1 ("spi: spi-cadence-quadspi: Enable pm runtime earlier to
> avoid imbalance") made the fact that when we do a pm_runtime_disable()
> in the error paths of probe() we can trigger a runtime disable which in
> turn results in duplicate clock disables. Early on in the probe function
> we do a pm_runtime_get_noresume() since the probe function leaves the
> device in a powered up state but in the error path we can't assume that PM
> is enabled so we also manually disable everything, including clocks. This
> means that when runtime PM is active both it and the probe function release
> the same reference to the main clock for the IP, triggering warnings from
> the clock subsystem:
>
> [ 8.693719] clk:75:7 already disabled
> [ 8.693791] WARNING: CPU: 1 PID: 185 at /usr/src/kernel/drivers/clk/clk.c:1188 clk_core_disable+0xa0/0xb
> ...
> [ 8.694261] clk_core_disable+0xa0/0xb4 (P)
> [ 8.694272] clk_disable+0x38/0x60
> [ 8.694283] cqspi_probe+0x7c8/0xc5c [spi_cadence_quadspi]
> [ 8.694309] platform_probe+0x5c/0xa4
>
> Avoid this confused ownership by moving the pm_runtime_get_noresume() to
> after the last point at which the probe() function can fail.
>
> Reported-by: Francesco Dolcini <francesco@dolcini.it>
> Closes: https://lore.kernel.org/r/20251201072844.GA6785@francesco-nb
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc: stable@vger.kernel.org
Unless I did some stupid mistake testing the patch, it does not fix the issue.
Here the log with v6.18 + this patch
[ 6.347380] ------------[ cut here ]------------
[ 6.352025] clk:75:7 already disabled
[ 6.355727] WARNING: CPU: 2 PID: 180 at drivers/clk/clk.c:1188 clk_core_disable+0xa4/0xac
[ 6.363921] Modules linked in: aes_ce_blk aes_ce_cipher ghash_ce gf128mul snd_soc_simple_card(+)
snd_soc_simple_card_utils spi_cadence_quadspi(+) optee(+) display_connector tee tps65219_pwrbutton u
sb_conn_gpio gpio_keys dwc3_am62 roles k3_j72xx_bandgap ti_ads1015 industrialio_triggered_buffer kfi
fo_buf sa2ul hci_uart sha512 bluetooth rtc_ti_k3 ecdh_generic libsha512 sha256 ecc snd_soc_davinci_m
casp snd_soc_ti_udma rfkill sha1 authenc snd_soc_ti_edma wave5 libaes snd_soc_ti_sdma cdns_dsi cdns_
dphy omap_hwspinlock lm75 omap_mailbox snd_soc_nau8822 i3c ina2xx lontium_lt8912b m_can_platform m_c
an can_dev pwm_tiehrpwm spi_omap2_mcspi fuse ipv6 libsha1 autofs4
[ 6.421984] CPU: 2 UID: 0 PID: 180 Comm: (udev-worker) Not tainted 6.18.0+ #2 PREEMPT
[ 6.429902] Hardware name: Toradex Verdin AM62P WB on Verdin Development Board (DT)
[ 6.437555] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 6.444515] pc : clk_core_disable+0xa4/0xac
[ 6.448708] lr : clk_core_disable+0xa4/0xac
[ 6.452896] sp : ffff800080bd37f0
[ 6.456234] x29: ffff800080bd37f0 x28: 0000000000000000 x27: ffff000001924810
[ 6.463406] x26: ffff000001924810 x25: ffffd9bd273e0408 x24: ffff000001924800
[ 6.470554] x23: 0000000000000000 x22: ffff000001439800 x21: 00000000ffffffed
[ 6.477693] x20: ffff000002476c00 x19: ffff000002476c00 x18: 0000000000000006
[ 6.484831] x17: 0000000000000000 x16: ffffd9bd3f665260 x15: 0720072007200720
[ 6.491970] x14: 0720072007200720 x13: 0720072007200720 x12: ffffd9bd401ee538
[ 6.499108] x11: 0000000000000058 x10: 0000000000000018 x9 : ffffd9bd401ee538
[ 6.506242] x8 : 0000000000000194 x7 : ffffd9bd40246538 x6 : ffffd9bd40246538
[ 6.513380] x5 : ffff00007fb886c8 x4 : 0000000000000000 x3 : 0000000000000027
[ 6.520518] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000005504300
[ 6.527656] Call trace:
[ 6.530104] clk_core_disable+0xa4/0xac (P)
[ 6.534296] clk_disable+0x30/0x4c
[ 6.537706] cqspi_probe+0x7c0/0xc64 [spi_cadence_quadspi]
[ 6.543209] platform_probe+0x5c/0xa4
[ 6.546885] really_probe+0xc0/0x39c
[ 6.550467] __driver_probe_device+0x7c/0x14c
[ 6.554831] driver_probe_device+0x3c/0x120
[ 6.559022] __driver_attach+0xc4/0x200
[ 6.559387] lt8912 1-0048: supply vccmipirx not found, using dummy regulator
[ 6.562855] bus_for_each_dev+0x7c/0xdc
[ 6.573734] driver_attach+0x24/0x30
[ 6.577315] bus_add_driver+0x110/0x240
[ 6.581158] driver_register+0x68/0x130
[ 6.585001] __platform_driver_register+0x24/0x30
[ 6.589714] cqspi_platform_driver_init+0x20/0x1000 [spi_cadence_quadspi]
[ 6.596516] do_one_initcall+0x60/0x1e0
[ 6.600363] do_init_module+0x54/0x23c
[ 6.604126] load_module+0x1810/0x1f14
[ 6.607886] init_module_from_file+0x88/0xcc
[ 6.612168] __arm64_sys_finit_module+0x264/0x358
[ 6.616883] invoke_syscall.constprop.0+0x50/0xe0
[ 6.619334] lt8912 1-0048: supply vccsysclk not found, using dummy regulator
[ 6.621583] do_el0_svc+0xa8/0xe0
[ 6.631945] el0_svc+0x3c/0x148
[ 6.635095] el0t_64_sync_handler+0xa0/0xf0
[ 6.639285] el0t_64_sync+0x198/0x19c
[ 6.642955] ---[ end trace 0000000000000000 ]---
[ 6.655277] ------------[ cut here ]------------
[ 6.659991] clk:75:7 already unprepared
[ 6.667941] WARNING: CPU: 2 PID: 180 at drivers/clk/clk.c:1047 clk_core_unprepare+0xe4/0x104
[ 6.668689] lt8912 1-0048: supply vcclvdstx not found, using dummy regulator
[ 6.676411] Modules linked in: spidev evdev(+) rng_core aes_ce_blk aes_ce_cipher ghash_ce gf128mu
l snd_soc_simple_card snd_soc_simple_card_utils spi_cadence_quadspi(+) optee display_connector tee t
ps65219_pwrbutton usb_conn_gpio gpio_keys dwc3_am62 roles k3_j72xx_bandgap ti_ads1015 industrialio_t
riggered_buffer kfifo_buf sa2ul hci_uart sha512 bluetooth rtc_ti_k3 ecdh_generic libsha512 sha256 ec
c snd_soc_davinci_mcasp snd_soc_ti_udma rfkill sha1 authenc snd_soc_ti_edma wave5 libaes snd_soc_ti_
sdma cdns_dsi cdns_dphy omap_hwspinlock lm75 omap_mailbox snd_soc_nau8822 i3c ina2xx lontium_lt8912b
m_can_platform m_can can_dev pwm_tiehrpwm spi_omap2_mcspi fuse ipv6 libsha1 autofs4
[ 6.699332] lt8912 1-0048: supply vcchdmitx not found, using dummy regulator
[ 6.743120] CPU: 2 UID: 0 PID: 180 Comm: (udev-worker) Tainted: G W 6.18.0+ #2 P
REEMPT
[ 6.743139] Tainted: [W]=WARN
[ 6.743142] Hardware name: Toradex Verdin AM62P WB on Verdin Development Board (DT)
[ 6.743147] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 6.743153] pc : clk_core_unprepare+0xe4/0x104
[ 6.743168] lr : clk_core_unprepare+0xe4/0x104
[ 6.743175] sp : ffff800080bd37f0
[ 6.743178] x29: ffff800080bd37f0 x28: 0000000000000000 x27: ffff000001924810
[ 6.796577] x26: ffff000001924810 x25: ffffd9bd273e0408 x24: ffff000001924800
[ 6.799311] lt8912 1-0048: supply vcclvdspll not found, using dummy regulator
[ 6.803704] x23: 0000000000000000 x22: ffff000001439800 x21: 00000000ffffffed
[ 6.803713] x20: 0000000000000000 x19: ffff000002476c00 x18: 0000000000000000
[ 6.803723] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 6.803731] x14: 000000000000008f x13: 00000000efe4b99a x12: 00009b471872c248
[ 6.803740] x11: 00000000000000c0 x10: 00000000000009e0 x9 : ffff800080bd3660
[ 6.803749] x8 : ffff000005504d40 x7 : 000000000000b67e x6 : 000000000000ba61
[ 6.803758] x5 : 0000000078b6e217 x4 : 0000000000000818 x3 : 0000000000800000
[ 6.803767] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000005504300
[ 6.803777] Call trace:
[ 6.803781] clk_core_unprepare+0xe4/0x104 (P)
[ 6.825908] lt8912 1-0048: supply vcchdmipll not found, using dummy regulator
[ 6.832255] clk_unprepare+0x2c/0x44
[ 6.832266] cqspi_probe+0x7c8/0xc64 [spi_cadence_quadspi]
[ 6.832286] platform_probe+0x5c/0xa4
[ 6.894596] really_probe+0xc0/0x39c
[ 6.898176] __driver_probe_device+0x7c/0x14c
[ 6.902534] driver_probe_device+0x3c/0x120
[ 6.906715] __driver_attach+0xc4/0x200
[ 6.910548] bus_for_each_dev+0x7c/0xdc
[ 6.914380] driver_attach+0x24/0x30
[ 6.917954] bus_add_driver+0x110/0x240
[ 6.921789] driver_register+0x68/0x130
[ 6.925627] __platform_driver_register+0x24/0x30
[ 6.930333] cqspi_platform_driver_init+0x20/0x1000 [spi_cadence_quadspi]
[ 6.937128] do_one_initcall+0x60/0x1e0
[ 6.940968] do_init_module+0x54/0x23c
[ 6.944720] load_module+0x1810/0x1f14
[ 6.948471] init_module_from_file+0x88/0xcc
[ 6.952740] __arm64_sys_finit_module+0x264/0x358
[ 6.957445] invoke_syscall.constprop.0+0x50/0xe0
[ 6.962149] do_el0_svc+0xa8/0xe0
[ 6.965463] el0_svc+0x3c/0x148
[ 6.968605] el0t_64_sync_handler+0xa0/0xf0
[ 6.972785] el0t_64_sync+0x198/0x19c
[ 6.976447] ---[ end trace 0000000000000000 ]---
On Thu, Dec 04, 2025 at 10:13:35AM +0100, Francesco Dolcini wrote: > Unless I did some stupid mistake testing the patch, it does not fix the issue. > Here the log with v6.18 + this patch Yeah, I'm pretty sure it didn't work. I've managed to find a system which instantiates the IP so hopefully I can modify it to trigger the issue and test directly rather than working blind, I've also noticed that we're getting [ 15.430306] cadence-qspi 13010000.spi: Runtime PM usage count underflow! even in normal operation (on that system anyway) so the runtime PM handling is definitely unhappy.
On 13:28-20251204, Mark Brown wrote: > On Thu, Dec 04, 2025 at 10:13:35AM +0100, Francesco Dolcini wrote: > > > Unless I did some stupid mistake testing the patch, it does not fix the issue. > > Here the log with v6.18 + this patch > > Yeah, I'm pretty sure it didn't work. I've managed to find a system > which instantiates the IP so hopefully I can modify it to trigger the > issue and test directly rather than working blind, I've also noticed > that we're getting > > [ 15.430306] cadence-qspi 13010000.spi: Runtime PM usage count underflow! > > even in normal operation (on that system anyway) so the runtime PM > handling is definitely unhappy. Hit the same issue on J721E platform [1], the following seems to help in my local testing [2]. I am no expert.. but do see if this makes sense The clock is already turned off by the runtime-PM suspend callback, so an extra clk_disable*_unprepare() is only correct when runtime-PM support is not in use. [1] https://dashboard.kernelci.org/log-viewer?itemId=ti%3A4db11a5806594ef99c7c2828&org=ti&type=test&url=http%3A%2F%2Ffiles.kernelci.org%2F%2Fti%2Fmainline%2Fmaster%2Fv6.18-763-g4a26e7032d7d5%2Farm64%2Fdefconfig%2Blab-setup%2Bkselftest%2Fgcc-14%2Fbaseline-nfs-boot.nfs-j721e-idk-gw.txt.gz [2] https://gist.github.com/nmenon/1ebaa7cf18e7df49ab59aaa7eb798a68 diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index af6d050da1c8..3678e971d6d9 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -2024,7 +2024,9 @@ static int cqspi_probe(struct platform_device *pdev) probe_reset_failed: if (cqspi->is_jh7110) cqspi_jh7110_disable_clk(pdev, cqspi); - clk_disable_unprepare(cqspi->clk); + /* Runtime-PM suspend already disables the core clock. */ + if (ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM)) + clk_disable_unprepare(cqspi->clk); probe_clk_failed: return ret; } @@ -2048,9 +2050,11 @@ static void cqspi_remove(struct platform_device *pdev) if (cqspi->rx_chan) dma_release_channel(cqspi->rx_chan); - if (!(ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM))) - if (pm_runtime_get_sync(&pdev->dev) >= 0) - clk_disable(cqspi->clk); + if (ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM)) + clk_disable_unprepare(cqspi->clk); + else + /* Balance the later pm_runtime_put_sync() */ + pm_runtime_get_sync(&pdev->dev); if (cqspi->is_jh7110) cqspi_jh7110_disable_clk(pdev, cqspi); -- 2.47.0 -- Regards, Nishanth Menon Key (0xDDB5849D1736249D) / Fingerprint: F8A2 8693 54EB 8232 17A3 1A34 DDB5 849D 1736 249D https://ti.com/opensource
On Thu, Dec 04, 2025 at 08:05:30AM -0600, Nishanth Menon wrote: > The clock is already turned off by the runtime-PM suspend callback, so an > extra clk_disable*_unprepare() is only correct when runtime-PM support is > not in use. Right, I'm pretty sure that's where the extra disable is coming from. The pm_runtime_set_active() further up the function is looking rather suspect here. > - clk_disable_unprepare(cqspi->clk); > + /* Runtime-PM suspend already disables the core clock. */ > + if (ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM)) > + clk_disable_unprepare(cqspi->clk); This will leak the reference if runtime PM isn't enabled unfortunately, no runtime PM operations will get called. Life would be vastly simpler were that mandatory :(
On Thu, Dec 04, 2025 at 03:11:26PM +0000, Mark Brown wrote:
> On Thu, Dec 04, 2025 at 08:05:30AM -0600, Nishanth Menon wrote:
> > The clock is already turned off by the runtime-PM suspend callback, so an
> > extra clk_disable*_unprepare() is only correct when runtime-PM support is
> > not in use.
> Right, I'm pretty sure that's where the extra disable is coming from.
> The pm_runtime_set_active() further up the function is looking rather
> suspect here.
qspi_setup_flash() is just reading DT data, it's not actually
interacting with the hardware at all, so I think we can sidestep the
immediate issue by just moving it to where we parse the DT for the
controller. It's not fixing the actual issue with the missing/extra
clock reference but it does get us back to where we were:
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
index af6d050da1c8..bdbeef05cd72 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -1845,6 +1845,12 @@ static int cqspi_probe(struct platform_device *pdev)
return -ENODEV;
}
+ ret = cqspi_setup_flash(cqspi);
+ if (ret) {
+ dev_err(dev, "failed to setup flash parameters %d\n", ret);
+ return ret;
+ }
+
/* Obtain QSPI clock. */
cqspi->clk = devm_clk_get(dev, NULL);
if (IS_ERR(cqspi->clk)) {
@@ -1988,12 +1994,6 @@ static int cqspi_probe(struct platform_device *pdev)
pm_runtime_get_noresume(dev);
}
- ret = cqspi_setup_flash(cqspi);
- if (ret) {
- dev_err(dev, "failed to setup flash parameters %d\n", ret);
- goto probe_setup_failed;
- }
-
host->num_chipselect = cqspi->num_chipselect;
if (ddata && (ddata->quirks & CQSPI_SUPPORT_DEVICE_RESET))
I'll send this out later assuming nothing blows up in my CI and nobody
else notices an issue.
Please ignore the previous mail.
Hi Mark and Nishanth The below seems to work for me on j721e. Let me
know your thoughts.
Apply over 6987d58a9cbc5bd57c983baa514474a86c945d56.
Also, the error actually comes from :
if (cqspi->use_direct_mode) {
ret = cqspi_request_mmap_dma(cqspi);
if (ret == -EPROBE_DEFER)
goto probe_setup_failed;
}
And not from flash_setup().
diff --git a/drivers/spi/spi-cadence-quadspi.c
b/drivers/spi/spi-cadence-quadspi.c
index af6d050da1c8..492b44976e52 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -2024,7 +2024,7 @@ static int cqspi_probe(struct platform_device *pdev)
probe_reset_failed:
if (cqspi->is_jh7110)
cqspi_jh7110_disable_clk(pdev, cqspi);
- clk_disable_unprepare(cqspi->clk);
+ pm_runtime_force_suspend(dev);
probe_clk_failed:
return ret;
}
Regards
Anurag
On 04-12-2025 22:35, Mark Brown wrote:
> On Thu, Dec 04, 2025 at 03:11:26PM +0000, Mark Brown wrote:
>> On Thu, Dec 04, 2025 at 08:05:30AM -0600, Nishanth Menon wrote:
>>> The clock is already turned off by the runtime-PM suspend callback, so an
>>> extra clk_disable*_unprepare() is only correct when runtime-PM support is
>>> not in use.
>> Right, I'm pretty sure that's where the extra disable is coming from.
>> The pm_runtime_set_active() further up the function is looking rather
>> suspect here.
> qspi_setup_flash() is just reading DT data, it's not actually
> interacting with the hardware at all, so I think we can sidestep the
> immediate issue by just moving it to where we parse the DT for the
> controller. It's not fixing the actual issue with the missing/extra
> clock reference but it does get us back to where we were:
>
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index af6d050da1c8..bdbeef05cd72 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -1845,6 +1845,12 @@ static int cqspi_probe(struct platform_device *pdev)
> return -ENODEV;
> }
>
> + ret = cqspi_setup_flash(cqspi);
> + if (ret) {
> + dev_err(dev, "failed to setup flash parameters %d\n", ret);
> + return ret;
> + }
> +
> /* Obtain QSPI clock. */
> cqspi->clk = devm_clk_get(dev, NULL);
> if (IS_ERR(cqspi->clk)) {
> @@ -1988,12 +1994,6 @@ static int cqspi_probe(struct platform_device *pdev)
> pm_runtime_get_noresume(dev);
> }
>
> - ret = cqspi_setup_flash(cqspi);
> - if (ret) {
> - dev_err(dev, "failed to setup flash parameters %d\n", ret);
> - goto probe_setup_failed;
> - }
> -
> host->num_chipselect = cqspi->num_chipselect;
>
> if (ddata && (ddata->quirks & CQSPI_SUPPORT_DEVICE_RESET))
>
> I'll send this out later assuming nothing blows up in my CI and nobody
> else notices an issue.
On Fri, Dec 05, 2025 at 06:28:06PM +0530, Dutta, Anurag wrote:
> Hi Mark and Nishanth The below seems to work for me on j721e. Let me know
> your thoughts.
> Also, the error actually comes from :
> if (cqspi->use_direct_mode) {
> ret = cqspi_request_mmap_dma(cqspi);
> if (ret == -EPROBE_DEFER)
> goto probe_setup_failed;
> }
> And not from flash_setup().
Great, thanks for confirming. We should probably ensure that has some
logging...
> @@ -2024,7 +2024,7 @@ static int cqspi_probe(struct platform_device *pdev)
> probe_reset_failed:
> if (cqspi->is_jh7110)
> cqspi_jh7110_disable_clk(pdev, cqspi);
> - clk_disable_unprepare(cqspi->clk);
> + pm_runtime_force_suspend(dev);
> probe_clk_failed:
The trouble with that is that in the !CONFIG_PM case
pm_runtime_force_suspend() is defined as:
static inline int pm_runtime_force_suspend(struct device *dev) { return 0; }
so we'll leak the clock enable. If we could just require runtime PM
this would be an awful lot easier to deal with.
On 05-12-2025 18:55, Mark Brown wrote:
> On Fri, Dec 05, 2025 at 06:28:06PM +0530, Dutta, Anurag wrote:
>
>> Hi Mark and Nishanth The below seems to work for me on j721e. Let me know
>> your thoughts.
>> Also, the error actually comes from :
>> if (cqspi->use_direct_mode) {
>> ret = cqspi_request_mmap_dma(cqspi);
>> if (ret == -EPROBE_DEFER)
>> goto probe_setup_failed;
>> }
>> And not from flash_setup().
> Great, thanks for confirming. We should probably ensure that has some
> logging...
>
>> @@ -2024,7 +2024,7 @@ static int cqspi_probe(struct platform_device *pdev)
>> probe_reset_failed:
>> if (cqspi->is_jh7110)
>> cqspi_jh7110_disable_clk(pdev, cqspi);
>> - clk_disable_unprepare(cqspi->clk);
>> + pm_runtime_force_suspend(dev);
>> probe_clk_failed:
> The trouble with that is that in the !CONFIG_PM case
> pm_runtime_force_suspend() is defined as:
>
> static inline int pm_runtime_force_suspend(struct device *dev) { return 0; }
>
> so we'll leak the clock enable. If we could just require runtime PM
> this would be an awful lot easier to deal with.
So, can we maintain an internal state of the clock(enabled/disabled) in
the struct cqspi_st ?
Before every, clk_disable_unprepare()/clk_prepare_enable(), we check if
the clock is actually
enabled/disabled by checking the state of "atomic_t clock_enabled"
within struct cqspi_st.
And, when we do clk_disable_unprepare()/clk_prepare_enable(), we set the
value of clock_enabled
accordingly.
Is this a good approach, given we take care of race conditions as well ?
On 09-12-2025 11:13, Dutta, Anurag wrote:
>
> On 05-12-2025 18:55, Mark Brown wrote:
>> On Fri, Dec 05, 2025 at 06:28:06PM +0530, Dutta, Anurag wrote:
>>
>>> Hi Mark and Nishanth The below seems to work for me on j721e. Let me
>>> know
>>> your thoughts.
>>> Also, the error actually comes from :
>>> if (cqspi->use_direct_mode) {
>>> ret = cqspi_request_mmap_dma(cqspi);
>>> if (ret == -EPROBE_DEFER)
>>> goto probe_setup_failed;
>>> }
>>> And not from flash_setup().
>> Great, thanks for confirming. We should probably ensure that has some
>> logging...
>>
>>> @@ -2024,7 +2024,7 @@ static int cqspi_probe(struct platform_device
>>> *pdev)
>>> probe_reset_failed:
>>> if (cqspi->is_jh7110)
>>> cqspi_jh7110_disable_clk(pdev, cqspi);
>>> - clk_disable_unprepare(cqspi->clk);
>>> + pm_runtime_force_suspend(dev);
>>> probe_clk_failed:
>> The trouble with that is that in the !CONFIG_PM case
>> pm_runtime_force_suspend() is defined as:
>>
>> static inline int pm_runtime_force_suspend(struct device *dev) {
>> return 0; }
>>
>> so we'll leak the clock enable. If we could just require runtime PM
>> this would be an awful lot easier to deal with.
> So, can we maintain an internal state of the clock(enabled/disabled)
> in the struct cqspi_st ?
> Before every, clk_disable_unprepare()/clk_prepare_enable(), we check
> if the clock is actually
> enabled/disabled by checking the state of "atomic_t clock_enabled"
> within struct cqspi_st.
> And, when we do clk_disable_unprepare()/clk_prepare_enable(), we set
> the value of clock_enabled
> accordingly.
>
> Is this a good approach, given we take care of race conditions as well ?
>
>
Another solution :
diff --git a/drivers/spi/spi-cadence-quadspi.c
b/drivers/spi/spi-cadence-quadspi.c
index af6d050da1c8..4d298b945f09 100644
--- a/drivers/spi/spi-cadence-quadspi.c
+++ b/drivers/spi/spi-cadence-quadspi.c
@@ -2024,7 +2024,11 @@ static int cqspi_probe(struct platform_device *pdev)
probe_reset_failed:
if (cqspi->is_jh7110)
cqspi_jh7110_disable_clk(pdev, cqspi);
- clk_disable_unprepare(cqspi->clk);
+
+ if (!(ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM))) {
+ if (pm_runtime_get_sync(&pdev->dev) >= 0)
+ clk_disable_unprepare(cqspi->clk);
+ }
probe_clk_failed:
return ret;
}
pm_runtime_get_sync() will increment the usage count thereby preventing
from runtime_suspend()
getting invoked, thereby preventing double clock_disable()
This will work for !CONFIG_PM as well because pm_runtime_get_sync() will
return 1 then.
and the runtime_suspend() is never going to be invoked.
logs :
with CONFIG_PM :
https://gist.github.com/anuragdutta731/c0080a7592de992e71103e9947639b2a
without CONFIG_PM :
https://gist.github.com/anuragdutta731/f9a79302b65088d53db2c970b6b8eff0
It work for both cases on j721e. Please share your thoughts.
Regards
Anurag
On Tue, Dec 09, 2025 at 03:22:43PM +0530, Dutta, Anurag wrote:
> On 09-12-2025 11:13, Dutta, Anurag wrote:
> Another solution :
> diff --git a/drivers/spi/spi-cadence-quadspi.c
> b/drivers/spi/spi-cadence-quadspi.c
> index af6d050da1c8..4d298b945f09 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -2024,7 +2024,11 @@ static int cqspi_probe(struct platform_device *pdev)
> probe_reset_failed:
> if (cqspi->is_jh7110)
> cqspi_jh7110_disable_clk(pdev, cqspi);
> - clk_disable_unprepare(cqspi->clk);
> +
> + if (!(ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM))) {
> + if (pm_runtime_get_sync(&pdev->dev) >= 0)
> + clk_disable_unprepare(cqspi->clk);
> + }
> pm_runtime_get_sync() will increment the usage count thereby preventing from
> runtime_suspend()
> getting invoked, thereby preventing double clock_disable()
>
> This will work for !CONFIG_PM as well because pm_runtime_get_sync() will
> return 1 then.
> and the runtime_suspend() is never going to be invoked.
I think we want this, possibly using pm_runtime_force_resume() instead
(not 100% sure on that one, glancing at the documentation there might be
issues though it feels like the intent of what we're doing here). Can
you send a patch please?
On 09-12-2025 16:00, Mark Brown wrote:
> On Tue, Dec 09, 2025 at 03:22:43PM +0530, Dutta, Anurag wrote:
>> On 09-12-2025 11:13, Dutta, Anurag wrote:
>> Another solution :
>> diff --git a/drivers/spi/spi-cadence-quadspi.c
>> b/drivers/spi/spi-cadence-quadspi.c
>> index af6d050da1c8..4d298b945f09 100644
>> --- a/drivers/spi/spi-cadence-quadspi.c
>> +++ b/drivers/spi/spi-cadence-quadspi.c
>> @@ -2024,7 +2024,11 @@ static int cqspi_probe(struct platform_device *pdev)
>> probe_reset_failed:
>> if (cqspi->is_jh7110)
>> cqspi_jh7110_disable_clk(pdev, cqspi);
>> - clk_disable_unprepare(cqspi->clk);
>> +
>> + if (!(ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM))) {
>> + if (pm_runtime_get_sync(&pdev->dev) >= 0)
>> + clk_disable_unprepare(cqspi->clk);
>> + }
>> pm_runtime_get_sync() will increment the usage count thereby preventing from
>> runtime_suspend()
>> getting invoked, thereby preventing double clock_disable()
>>
>> This will work for !CONFIG_PM as well because pm_runtime_get_sync() will
>> return 1 then.
>> and the runtime_suspend() is never going to be invoked.
> I think we want this, possibly using pm_runtime_force_resume() instead
> (not 100% sure on that one, glancing at the documentation there might be
> issues though it feels like the intent of what we're doing here). Can
> you send a patch please?
> Hi Mark
> If we use pm_runtime_force_resume(), then for CONFIG_PM_SLEEP, it will work.
> But, in case of !CONFIG_PM_SLEEP, pm_runtime_force_resume() returns -ENXIO,
> because of which :
>
> if (pm_runtime_get_sync(&pdev->dev) >= 0)
>
> will be become false and clk_disable_unprepare() will never be invoked,
> which might be an issue.
>
> Let me send a patch with pm_runtime_get_sync(). If changes are required,
> please let me know.
>
> Regards
> Anurag
On Fri, Dec 12, 2025 at 10:04:47AM +0530, Dutta, Anurag wrote: > > Let me send a patch with pm_runtime_get_sync(). If changes are required, > > please let me know. BTW note that we do need to have a clock disable for the case where CQSPI_DISABLE_RUNTIME_PM is quirked in as well as the with/without runtime PM cases.
On 12-12-2025 11:37, Mark Brown wrote: > On Fri, Dec 12, 2025 at 10:04:47AM +0530, Dutta, Anurag wrote: > >>> Let me send a patch with pm_runtime_get_sync(). If changes are required, >>> please let me know. > BTW note that we do need to have a clock disable for the case where > CQSPI_DISABLE_RUNTIME_PM is quirked in as well as the with/without > runtime PM cases. ok, will remove this : if (!(ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM))) because pm_runtime_get_sync() >=0 will be true even if pm_runtime_enable() is never invoked.
On Fri, Dec 12, 2025 at 10:04:47AM +0530, Dutta, Anurag wrote: > On 09-12-2025 16:00, Mark Brown wrote: Quoting seems messed up here - your whole mail is quoted. > > I think we want this, possibly using pm_runtime_force_resume() instead > > (not 100% sure on that one, glancing at the documentation there might be > > issues though it feels like the intent of what we're doing here). Can > > you send a patch please? > > Hi Mark > > If we use pm_runtime_force_resume(), then for CONFIG_PM_SLEEP, it will work. > > But, in case of !CONFIG_PM_SLEEP, pm_runtime_force_resume() returns -ENXIO, > > because of which : Huh, hadn't noticed an interaction with PM_SLEEP. My thinking was partly that this is convenient since it lets you know if runtime PM isn't enabled and you need to unwind the clocks separately. But I don't super mind so long as we've got something that works. > > Let me send a patch with pm_runtime_get_sync(). If changes are required, > > please let me know. Yes, please.
Hi Mark and Nishanth The below seems to work for me on j721e. Let me
know your thoughts. Apply over 6987d58a9cbc5bd57c983baa514474a86c945d56.
Also, the error actually comes from : if (cqspi->use_direct_mode) { ret
= cqspi_request_mmap_dma(cqspi); if (ret == -EPROBE_DEFER) goto
probe_setup_failed; } And not from flash_setup().
diff --git a/drivers/spi/spi-cadence-quadspi.c
b/drivers/spi/spi-cadence-quadspi.c index af6d050da1c8..492b44976e52
100644 --- a/drivers/spi/spi-cadence-quadspi.c +++
b/drivers/spi/spi-cadence-quadspi.c @@ -2024,7 +2024,7 @@ static int
cqspi_probe(struct platform_device *pdev) probe_reset_failed: if
(cqspi->is_jh7110) cqspi_jh7110_disable_clk(pdev, cqspi); -
clk_disable_unprepare(cqspi->clk); + pm_runtime_force_suspend(dev);
probe_clk_failed: return ret; }
Regards Anurag
On 04-12-2025 22:35, Mark Brown wrote:
> On Thu, Dec 04, 2025 at 03:11:26PM +0000, Mark Brown wrote:
>> On Thu, Dec 04, 2025 at 08:05:30AM -0600, Nishanth Menon wrote:
>>> The clock is already turned off by the runtime-PM suspend callback, so an
>>> extra clk_disable*_unprepare() is only correct when runtime-PM support is
>>> not in use.
>> Right, I'm pretty sure that's where the extra disable is coming from.
>> The pm_runtime_set_active() further up the function is looking rather
>> suspect here.
> qspi_setup_flash() is just reading DT data, it's not actually
> interacting with the hardware at all, so I think we can sidestep the
> immediate issue by just moving it to where we parse the DT for the
> controller. It's not fixing the actual issue with the missing/extra
> clock reference but it does get us back to where we were:
>
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index af6d050da1c8..bdbeef05cd72 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -1845,6 +1845,12 @@ static int cqspi_probe(struct platform_device *pdev)
> return -ENODEV;
> }
>
> + ret = cqspi_setup_flash(cqspi);
> + if (ret) {
> + dev_err(dev, "failed to setup flash parameters %d\n", ret);
> + return ret;
> + }
> +
> /* Obtain QSPI clock. */
> cqspi->clk = devm_clk_get(dev, NULL);
> if (IS_ERR(cqspi->clk)) {
> @@ -1988,12 +1994,6 @@ static int cqspi_probe(struct platform_device *pdev)
> pm_runtime_get_noresume(dev);
> }
>
> - ret = cqspi_setup_flash(cqspi);
> - if (ret) {
> - dev_err(dev, "failed to setup flash parameters %d\n", ret);
> - goto probe_setup_failed;
> - }
> -
> host->num_chipselect = cqspi->num_chipselect;
>
> if (ddata && (ddata->quirks & CQSPI_SUPPORT_DEVICE_RESET))
>
> I'll send this out later assuming nothing blows up in my CI and nobody
> else notices an issue.
On Thu, Dec 04, 2025 at 01:28:05PM +0000, Mark Brown wrote: > issue and test directly rather than working blind, I've also noticed > that we're getting > [ 15.430306] cadence-qspi 13010000.spi: Runtime PM usage count underflow! > even in normal operation (on that system anyway) so the runtime PM > handling is definitely unhappy. Ah, I applied a patch the other day fixing that one.
On Tue, 2025-12-02 at 22:53 +0000, Mark Brown wrote:
> The recent refactoring of where runtime PM is enabled done in commit
> f1eb4e792bb1 ("spi: spi-cadence-quadspi: Enable pm runtime earlier to
> avoid imbalance") made the fact that when we do a pm_runtime_disable()
> in the error paths of probe() we can trigger a runtime disable which in
> turn results in duplicate clock disables. Early on in the probe function
> we do a pm_runtime_get_noresume() since the probe function leaves the
> device in a powered up state but in the error path we can't assume that PM
> is enabled so we also manually disable everything, including clocks. This
> means that when runtime PM is active both it and the probe function release
> the same reference to the main clock for the IP, triggering warnings from
> the clock subsystem:
>
> [ 8.693719] clk:75:7 already disabled
> [ 8.693791] WARNING: CPU: 1 PID: 185 at /usr/src/kernel/drivers/clk/clk.c:1188 clk_core_disable+0xa0/0xb
> ...
> [ 8.694261] clk_core_disable+0xa0/0xb4 (P)
> [ 8.694272] clk_disable+0x38/0x60
> [ 8.694283] cqspi_probe+0x7c8/0xc5c [spi_cadence_quadspi]
> [ 8.694309] platform_probe+0x5c/0xa4
>
> Avoid this confused ownership by moving the pm_runtime_get_noresume() to
> after the last point at which the probe() function can fail.
>
> Reported-by: Francesco Dolcini <francesco@dolcini.it>
> Closes: https://lore.kernel.org/r/20251201072844.GA6785@francesco-nb
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc: stable@vger.kernel.org
> ---
Thank you for the patch. The 'clock already disabled' issue persists on
J721E SoC with the patch applied:
https://gist.github.com/Siddharth-Vadapalli-at-TI/a664f59366ad681856b862d621487b7f
> drivers/spi/spi-cadence-quadspi.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c
> index af6d050da1c8..0833b6f666d0 100644
> --- a/drivers/spi/spi-cadence-quadspi.c
> +++ b/drivers/spi/spi-cadence-quadspi.c
> @@ -1985,7 +1985,6 @@ static int cqspi_probe(struct platform_device *pdev)
> pm_runtime_enable(dev);
> pm_runtime_set_autosuspend_delay(dev, CQSPI_AUTOSUSPEND_TIMEOUT);
> pm_runtime_use_autosuspend(dev);
> - pm_runtime_get_noresume(dev);
> }
>
> ret = cqspi_setup_flash(cqspi);
> @@ -2012,6 +2011,7 @@ static int cqspi_probe(struct platform_device *pdev)
> }
>
> if (!(ddata && (ddata->quirks & CQSPI_DISABLE_RUNTIME_PM))) {
> + pm_runtime_get_noresume(dev);
> pm_runtime_mark_last_busy(dev);
> pm_runtime_put_autosuspend(dev);
> }
Regards,
Siddharth.
On Wed, Dec 03, 2025 at 04:12:04PM +0530, Siddharth Vadapalli wrote: > On Tue, 2025-12-02 at 22:53 +0000, Mark Brown wrote: > > Avoid this confused ownership by moving the pm_runtime_get_noresume() to > > after the last point at which the probe() function can fail. > Thank you for the patch. The 'clock already disabled' issue persists on > J721E SoC with the patch applied: Do you mean it reappears rather than persists? > https://gist.github.com/Siddharth-Vadapalli-at-TI/a664f59366ad681856b862d621487b7f Interesting, I can't see any logging that indicates we should be reaching the error handling paths in that log...
© 2016 - 2025 Red Hat, Inc.