Hello all,
After commit 1d6161617c10 (“arm64: dts: ti: k3-am62-ti-ipc-firmware:
Refactor IPC cfg into new dtsi”), suspend-to-RAM stopped working on
AM62x.
When I originally tested that change, I did not test suspend-to-RAM
functionality, but our testing infrastructure caught this regression.
See the log below:
root@verdin-am62-15479173:~# cat /sys/class/remoteproc/remoteproc*/state
offline
offline
offline
root@verdin-am62-15479173:~# echo mem > /sys/power/state
[ 37.798686] PM: suspend entry (deep)
[ 37.805942] Filesystems sync: 0.003 seconds
[ 37.811965] Freezing user space processes
[ 37.819214] Freezing user space processes completed (elapsed 0.002 seconds)
[ 37.826469] OOM killer disabled.
[ 37.829721] Freezing remaining freezable tasks
[ 37.835557] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 37.843057] printk: Suspending console(s) (use no_console_suspend to debug)
[ 37.953874] omap-mailbox 29000000.mailbox: fifo 5 has unexpected unread messages
[ 37.953909] omap-mailbox 29000000.mailbox: PM: dpm_run_callback(): platform_pm_suspend returns -16
[ 37.953941] omap-mailbox 29000000.mailbox: PM: failed to suspend: error -16
[ 37.953967] PM: Some devices failed to suspend, or early wake event detected
[ 37.973876] am65-cpsw-nuss 8000000.ethernet: set new flow-id-base 19
[ 37.984655] am65-cpsw-nuss 8000000.ethernet end0: PHY [8000f00.mdio:00] driver [TI DP83867] (irq=353)
[ 37.985655] am65-cpsw-nuss 8000000.ethernet end0: configuring for phy/rgmii-rxid link mode
[ 38.009002] usb-conn-gpio connector: repeated role: device
[ 38.013377] lt8912 1-0048: PM: dpm_run_callback(): lt8912_bridge_resume [lontium_lt8912b] returns -121
[ 38.013420] lt8912 1-0048: PM: failed to resume async: error -121
[ 38.153252] OOM killer enabled.
[ 38.156422] Restarting tasks: Starting
[ 38.163532] Restarting tasks: Done
[ 38.167252] random: crng reseeded on system resumption
[ 38.173031] PM: suspend exit
The omap-mailbox driver returns -EBUSY because it detects an unexpected
unread message on FIFO 5. As I understand it, this FIFO corresponds to
the communication channel between the DM R5 and the Cortex-M4 cores.
DM R5 sends a message that is never consumed, since no firmware is
running on the M4 (the core is offline). This unhandled message prevents
the system from entering suspend.
This issue also appears on the downstream TI kernel, which I reported
earlier [1] (for reference).
The following patch resolves the problem:
diff --git a/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi b/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi
index ea69fab9b52b..e07cf3290cc3 100644
--- a/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi
@@ -26,11 +26,6 @@ mbox_m4_0: mbox-m4-0 {
 		ti,mbox-rx = <0 0 0>;
 		ti,mbox-tx = <1 0 0>;
 	};
-
-	mbox_r5_0: mbox-r5-0 {
-		ti,mbox-rx = <2 0 0>;
-		ti,mbox-tx = <3 0 0>;
-	};
 };
 
 &mcu_m4fss {
@@ -45,7 +40,6 @@ &wkup_r5fss0 {
 };
 
 &wkup_r5fss0_core0 {
-	mboxes = <&mailbox0_cluster0 &mbox_r5_0>;
 	memory-region = <&wkup_r5fss0_core0_dma_memory_region>,
 			<&wkup_r5fss0_core0_memory_region>;
 	status = "okay";
Ultimately this issue is related to the omap-mailbox driver itself:
1. We should have functionality to save and restore the messages in
the mailbox, instead of preventing the system from suspending.
2. Or we could skip FIFOs that the kernel does not own, instead of
checking all 16:
	for (fifo = 0; fifo < mdev->num_fifos; fifo++) {
		if (mbox_read_reg(mdev, MAILBOX_MSGSTATUS(fifo))) {
			dev_err(mdev->dev, "fifo %d has unexpected unread messages\n",
				fifo);
			return -EBUSY;
		}
	}
Setting the number of FIFOs to 4 in the device tree also resolves this
issue.
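To make option 2 concrete, here is a minimal userspace sketch. It is a
model only, not omap-mailbox driver code: struct mbox_model, its
owned_mask field and both function names are hypothetical, and
msg_count[] merely stands in for the per-FIFO MAILBOX_MSGSTATUS reads.
It assumes Linux only owns the FIFOs referenced by the mbox-* child
nodes.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_FIFOS 16

/*
 * Userspace model, not driver code. msg_count[n] stands in for the value
 * read from MAILBOX_MSGSTATUS(n). owned_mask is a hypothetical field:
 * bit n is set when Linux has a channel on FIFO n.
 */
struct mbox_model {
	uint32_t msg_count[NUM_FIFOS];
	uint32_t owned_mask;
};

/* Behaviour of the loop quoted above: any unread message blocks suspend. */
int suspend_check_all(const struct mbox_model *m)
{
	for (int fifo = 0; fifo < NUM_FIFOS; fifo++) {
		if (m->msg_count[fifo]) {
			fprintf(stderr, "fifo %d has unexpected unread messages\n", fifo);
			return -EBUSY;
		}
	}
	return 0;
}

/* Variant for option 2: FIFOs that Linux does not own are skipped. */
int suspend_check_owned(const struct mbox_model *m)
{
	for (int fifo = 0; fifo < NUM_FIFOS; fifo++) {
		if (!(m->owned_mask & (UINT32_C(1) << fifo)))
			continue;
		if (m->msg_count[fifo]) {
			fprintf(stderr, "fifo %d has unexpected unread messages\n", fifo);
			return -EBUSY;
		}
	}
	return 0;
}
```

With owned_mask = 0xf (FIFOs 0 to 3, the ones used by mbox_m4_0 and
mbox_r5_0 in the dtsi) and one message pending on FIFO 5,
suspend_check_all() returns -EBUSY while suspend_check_owned() returns 0.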
Do you have suggestions on how best to fix this in the driver, or should
we consider reverting the DTS change until suspend-to-RAM works again?
#regzbot introduced: 1d6161617c
[1] https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1557295/am62p-mailbox-channel-is-not-freed-during-r5-remoteproc-stop-call/6069413
Best regards,
Hiago.
Hi Hiago,
On 20/10/25 19:47, Hiago De Franco wrote:
> Hello all,
>
> After commit 1d6161617c10 (“arm64: dts: ti: k3-am62-ti-ipc-firmware:
> Refactor IPC cfg into new dtsi”), suspend-to-RAM stopped working on
> AM62x.
The above commit is only refactoring changes and should not
cause any trouble. I think the commit you are interested in
should be: a49f991e740f ("arm64: dts: ti: k3-am62-verdin:
Add missing cfg for TI IPC Firmware").
>
> When I originally tested that change, I did not test suspend-to-RAM
> functionality, but our testing infrastructure caught this regression.
>
> See the log below:
>
> root@verdin-am62-15479173:~# cat /sys/class/remoteproc/remoteproc*/state
> offline
> offline
> offline
> root@verdin-am62-15479173:~# echo mem > /sys/power/state
> [ 37.798686] PM: suspend entry (deep)
> [ 37.805942] Filesystems sync: 0.003 seconds
> [ 37.811965] Freezing user space processes
> [ 37.819214] Freezing user space processes completed (elapsed 0.002 seconds)
> [ 37.826469] OOM killer disabled.
> [ 37.829721] Freezing remaining freezable tasks
> [ 37.835557] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [ 37.843057] printk: Suspending console(s) (use no_console_suspend to debug)
> [ 37.953874] omap-mailbox 29000000.mailbox: fifo 5 has unexpected unread messages
> [ 37.953909] omap-mailbox 29000000.mailbox: PM: dpm_run_callback(): platform_pm_suspend returns -16
> [ 37.953941] omap-mailbox 29000000.mailbox: PM: failed to suspend: error -16
> [ 37.953967] PM: Some devices failed to suspend, or early wake event detected
> [ 37.973876] am65-cpsw-nuss 8000000.ethernet: set new flow-id-base 19
> [ 37.984655] am65-cpsw-nuss 8000000.ethernet end0: PHY [8000f00.mdio:00] driver [TI DP83867] (irq=353)
> [ 37.985655] am65-cpsw-nuss 8000000.ethernet end0: configuring for phy/rgmii-rxid link mode
> [ 38.009002] usb-conn-gpio connector: repeated role: device
> [ 38.013377] lt8912 1-0048: PM: dpm_run_callback(): lt8912_bridge_resume [lontium_lt8912b] returns -121
> [ 38.013420] lt8912 1-0048: PM: failed to resume async: error -121
> [ 38.153252] OOM killer enabled.
> [ 38.156422] Restarting tasks: Starting
> [ 38.163532] Restarting tasks: Done
> [ 38.167252] random: crng reseeded on system resumption
> [ 38.173031] PM: suspend exit
>
> The omap-mailbox driver returns -EBUSY because it detects an unexpected
> unread message on FIFO 5. As I understand it, this FIFO corresponds to
> the communication channel between the DM R5 and the Cortex-M4 cores.
>
> DM R5 sends a message that is never consumed, since no firmware is
> running on the M4 (the core is offline).
May I know why you are not running any firmware on the M4
rproc? If the intention is just to run the DM R5 core on the SoC,
you can disable the IPC by NOT including the
"k3-am62-ti-ipc-firmware.dtsi". That was the motivation for the
refactoring.
> This unhandled message prevents
> the system from entering suspend.
The underlying problem is in the mailbox driver handling,
see below.
>
> This issue also appears on the downstream TI kernel, which I reported
> earlier [1] (for reference).
>
> The following patch resolves the problem:
>
> diff --git a/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi b/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi
> index ea69fab9b52b..e07cf3290cc3 100644
> --- a/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi
> +++ b/arch/arm64/boot/dts/ti/k3-am62-ti-ipc-firmware.dtsi
> @@ -26,11 +26,6 @@ mbox_m4_0: mbox-m4-0 {
>  		ti,mbox-rx = <0 0 0>;
>  		ti,mbox-tx = <1 0 0>;
>  	};
> -
> -	mbox_r5_0: mbox-r5-0 {
> -		ti,mbox-rx = <2 0 0>;
> -		ti,mbox-tx = <3 0 0>;
> -	};
>  };
>  
>  &mcu_m4fss {
> @@ -45,7 +40,6 @@ &wkup_r5fss0 {
>  };
>  
>  &wkup_r5fss0_core0 {
> -	mboxes = <&mailbox0_cluster0 &mbox_r5_0>;
>  	memory-region = <&wkup_r5fss0_core0_dma_memory_region>,
>  			<&wkup_r5fss0_core0_memory_region>;
>  	status = "okay";
>
> Ultimately this issue is related to the omap-mailbox driver itself:
>
> 1. We should have functionality to save and restore the messages in
> the mailbox, instead of preventing the system from suspending.
Quoting Hari:
"Restoring the stale mailbox messages could actually create
problems, depending on how the mailbox messages are used in
the IPC. If they hold indexes/pointers to some other IPC
structures or buffers(remember RTOS-RTOS IPC has notify
messaging in addition to RP messages) created dynamically
could lead to fatal errors."
>
> 2. Or we could skip FIFOs that the kernel does not own, instead of
> checking all 16:
>
> 	for (fifo = 0; fifo < mdev->num_fifos; fifo++) {
> 		if (mbox_read_reg(mdev, MAILBOX_MSGSTATUS(fifo))) {
> 			dev_err(mdev->dev, "fifo %d has unexpected unread messages\n",
> 				fifo);
> 			return -EBUSY;
> 		}
> 	}
>
> Setting the number of FIFOs to 4 in the device tree also resolves this
> issue.
This only avoids the issue. IMHO it's better to flush the pending
messages in the mbox driver while entering suspend as we are
just rebooting rprocs for S/R today. Whenever rprocs can
actually "resume" context from the point they were
"suspended", then we can add support to restore mailbox
messages too.
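As a rough illustration of the flush idea, the following userspace
sketch drains every FIFO until its message count reads zero. It is a
model under stated assumptions, not the omap-mailbox implementation:
struct fifo_model, the helper names and the FIFO depth of 4 are all
hypothetical, and it ignores details such as access restrictions on
individual FIFOs.

```c
#include <stdint.h>

#define NUM_FIFOS  16
#define FIFO_DEPTH 4	/* assumed depth for this model */

/* Userspace model of one hardware FIFO; not driver code. */
struct fifo_model {
	uint32_t msgs[FIFO_DEPTH];
	unsigned int count;
};

/* Stand-in for reading MAILBOX_MSGSTATUS: number of unread messages. */
unsigned int fifo_msgstatus(const struct fifo_model *f)
{
	return f->count;
}

/* Stand-in for reading the message register: pops the oldest message. */
uint32_t fifo_read_message(struct fifo_model *f)
{
	uint32_t msg;

	if (f->count == 0)
		return 0;
	msg = f->msgs[0];
	for (unsigned int i = 1; i < f->count; i++)
		f->msgs[i - 1] = f->msgs[i];
	f->count--;
	return msg;
}

/* Drain every FIFO and return how many stale messages were dropped. */
unsigned int flush_pending(struct fifo_model *fifos, unsigned int num_fifos)
{
	unsigned int dropped = 0;

	for (unsigned int fifo = 0; fifo < num_fifos; fifo++) {
		while (fifo_msgstatus(&fifos[fifo])) {
			(void)fifo_read_message(&fifos[fifo]);
			dropped++;
		}
	}
	return dropped;
}
```

Called from the suspend path before the unread-message check, a flush
like this would leave every count at zero, so the check could no longer
fail on a stale message such as the one on FIFO 5.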
>
> Do you have suggestions on how best to fix this in the driver, or should
> we consider reverting the DTS change until suspend-to-RAM works again?
List of suggestions/solutions in order of preference:
1. If no intention to enable IPC on rprocs:
Do _not_ include k3-am62-ti-ipc-firmware.dtsi
2. If intention is to enable IPC on rprocs:
Make sure rproc firmware is available in rootfs.
rproc would boot up and consume the mbox
msg, suspend would be successful. Tested this
on TI AM62x-sk with commit 1d6161617c, works
3. Add support in mbox driver to flush the pending
queues.
>
> #regzbot introduced: 1d6161617c
Would not see this as a regression, but rather a new
bug for the omap-mailbox driver...
Thanks,
Beleswar
>
> [1] https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1557295/am62p-mailbox-channel-is-not-freed-during-r5-remoteproc-stop-call/6069413
>
> Best regards,
> Hiago.
On 21/10/25 14:33, Beleswar Prasad Padhi wrote:
> Hi Hiago,
>
> On 20/10/25 19:47, Hiago De Franco wrote:
>> Hello all,
>>
>> After commit 1d6161617c10 (“arm64: dts: ti: k3-am62-ti-ipc-firmware:
>> Refactor IPC cfg into new dtsi”), suspend-to-RAM stopped working on
>> AM62x.
>
> The above commit is only refactoring changes and should not
> cause any trouble. I think the commit you are interested in
> should be: a49f991e740f ("arm64: dts: ti: k3-am62-verdin:
> Add missing cfg for TI IPC Firmware").
>
<snip>
>
>> Do you have suggestions on how best to fix this in the driver, or should
>> we consider reverting the DTS change until suspend-to-RAM works again?
>
> List of suggestions/solutions in order of preference:
> 1. If no intention to enable IPC on rprocs:
> Do _not_ include k3-am62-ti-ipc-firmware.dtsi
> 2. If intention is to enable IPC on rprocs:
> Make sure rproc firmware is available in rootfs.
> rproc would boot up and consume the mbox
> msg, suspend would be successful. Tested this
> on TI AM62x-sk with commit 1d6161617c, works
> 3. Add support in mbox driver to flush the pending
> queues.
Posted an RFC version for #3:
https://lore.kernel.org/all/20251022102015.1345696-1-b-padhi@ti.com/
It still has open questions regarding scenarios with
FIFO firewalling and supporting existing OMAP SoCs
which could restore context upon resume.
Cc: Andrew, Hari
Thanks,
Beleswar
On Tue, Oct 21, 2025 at 02:33:10PM +0530, Beleswar Prasad Padhi wrote:
> On 20/10/25 19:47, Hiago De Franco wrote:
> > DM R5 sends a message that is never consumed, since no firmware is
> > running on the M4 (the core is offline).
>
> May I know why you are not running any firmware on the M4
> rproc? If the intention is just to run the DM R5 core on the SoC,
> you can disable the IPC by NOT including the
> "k3-am62-ti-ipc-firmware.dtsi". That was the motivation for the
> refactoring.
Verdin AM62 and AM62P are generic SoMs, that can be used for a multitude
of different use cases. And not having anything running on the M4 is the
default use case.
I think having the node in the DT is the correct way forward, if you
want to start the M4 firmware you need such a node, so this is enabling
a valid and useful use case.
> List of suggestions/solutions in order of preference:
> 1. If no intention to enable IPC on rprocs:
> Do _not_ include k3-am62-ti-ipc-firmware.dtsi
> 2. If intention is to enable IPC on rprocs:
> Make sure rproc firmware is available in rootfs.
> rproc would boot up and consume the mbox
> msg, suspend would be successful. Tested this
> on TI AM62x-sk with commit 1d6161617c, works
> 3. Add support in mbox driver to flush the pending
> queues.
2 is not applicable here, and 1 to me is not a good solution. So this
means that we need #3.
> > #regzbot introduced: 1d6161617c
>
> Would not see this as a regression, but rather a new
> bug for the omap-mailbox driver...
As a user this is just a regression. It worked fine before, it's not
working anymore now.
The fact that the solution might not be in the same file that introduced
the issue is not a reason for this not being considered a regression.
Francesco
On 21/10/25 15:04, Francesco Dolcini wrote:
> On Tue, Oct 21, 2025 at 02:33:10PM +0530, Beleswar Prasad Padhi wrote:
>> On 20/10/25 19:47, Hiago De Franco wrote:
>>> DM R5 sends a message that is never consumed, since no firmware is
>>> running on the M4 (the core is offline).
>>
>> May I know why you are not running any firmware on the M4
>> rproc? If the intention is just to run the DM R5 core on the SoC,
>> you can disable the IPC by NOT including the
>> "k3-am62-ti-ipc-firmware.dtsi". That was the motivation for the
>> refactoring.
> Verdin AM62 and AM62P are generic SoMs, that can be used for a multitude
> of different use cases. And not having anything running on the M4 is the
> default use case.
If not having anything on M4 is the default use case, it should
be marked as "status=disabled" in the DT.
>
> I think having the node in the DT is the correct way forward, if you
> want to start the M4 firmware you need such a node, so this is enabling
> a valid and useful use case.
Having the node is fine, you can still choose to keep it
disabled by default.
>
>> List of suggestions/solutions in order of preference:
>> 1. If no intention to enable IPC on rprocs:
>> Do _not_ include k3-am62-ti-ipc-firmware.dtsi
>> 2. If intention is to enable IPC on rprocs:
>> Make sure rproc firmware is available in rootfs.
>> rproc would boot up and consume the mbox
>> msg, suspend would be successful. Tested this
>> on TI AM62x-sk with commit 1d6161617c, works
>> 3. Add support in mbox driver to flush the pending
>> queues.
> 2 is not applicable here, and 1 to me is not a good solution.
Why not? Why would you power on the rproc, enable
the mailboxes, carveout some memory if you never
intend to use it?
> So this
> means that we need #3.
>
>>> #regzbot introduced: 1d6161617c
>> Would not see this as a regression, but rather a new
>> bug for the omap-mailbox driver...
> As a user this is just a regression. It worked fine before, it's not
> working anymore now.
Isn't this partly dependent on the filesystem as well?
You would not see this behavior if you package the
firmware in rootfs, which I assume you did while
testing a49f991e740f
https://lore.kernel.org/all/20250908142826.1828676-17-b-padhi@ti.com/
>
> The fact that the solution might not be in the same file that introduced
> the issue is not a reason for this not being considered a regression.
>
> Francesco
>
On Tue, 2025-10-21 at 15:26 +0530, Beleswar Prasad Padhi wrote:
> On 21/10/25 15:04, Francesco Dolcini wrote:
> > On Tue, Oct 21, 2025 at 02:33:10PM +0530, Beleswar Prasad Padhi wrote:
> > > On 20/10/25 19:47, Hiago De Franco wrote:
> > > > DM R5 sends a message that is never consumed, since no firmware is
> > > > running on the M4 (the core is offline).
> > >
> > > May I know why you are not running any firmware on the M4
> > > rproc? If the intention is just to run the DM R5 core on the SoC,
> > > you can disable the IPC by NOT including the
> > > "k3-am62-ti-ipc-firmware.dtsi". That was the motivation for the
> > > refactoring.
> > Verdin AM62 and AM62P are generic SoMs, that can be used for a multitude
> > of different use cases. And not having anything running on the M4 is the
> > default use case.
>
> If not having anything on M4 is the default use case, it should
> be marked as "status=disabled" in the DT.
>
> >
> > I think having the node in the DT is the correct way forward, if you
> > want to start the M4 firmware you need such a node, so this is enabling
> > a valid and useful use case.
>
> Having the node is fine, you can still choose to keep it
> disabled by default.
I agree with Francesco that it would be nice to keep the node enabled by default
- whether something is running on the M4 can be controlled via sysfs after all,
and may change over the runtime of the OS. On our TQ starterkit mainboards, we'd
like to provide the option to build the BSP with or without M4 firmware without
having to modify the DTS (which is supposed to describe the hardware after all -
I'm aware that this principle has its limits, as the DT also needs to reserve
memory ranges for MCU firmware usage, but having a few unused memory
reservations isn't as disruptive as breaking suspend when the M4 is not
running).
Best,
Matthias
>
> > > List of suggestions/solutions in order of preference:
> > > 1. If no intention to enable IPC on rprocs:
> > > Do _not_ include k3-am62-ti-ipc-firmware.dtsi
> > > 2. If intention is to enable IPC on rprocs:
> > > Make sure rproc firmware is available in rootfs.
> > > rproc would boot up and consume the mbox
> > > msg, suspend would be successful. Tested this
> > > on TI AM62x-sk with commit 1d6161617c, works
> > > 3. Add support in mbox driver to flush the pending
> > > queues.
> > 2 is not applicable here, and 1 to me is not a good solution.
>
> Why not? Why would you power on the rproc, enable
> the mailboxes, carveout some memory if you never
> intend to use it?
>
> > So this
> > means that we need #3.
> >
> > > > #regzbot introduced: 1d6161617c
> > > Would not see this as a regression, but rather a new
> > > bug for the omap-mailbox driver...
> > As a user this is just a regression. It worked fine before, it's not
> > working anymore now.
>
> Isn't this partly dependent on the filesystem as well?
> You would not see this behavior if you package the
> firmware in rootfs, which I assume you did while
> testing a49f991e740f
>
> https://lore.kernel.org/all/20250908142826.1828676-17-b-padhi@ti.com/
>
> >
> > The fact that the solution might not be in the same file that introduced
> > the issue is not a reason for this not being considered a regression.
> >
> > Francesco
> >
--
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
https://www.tq-group.com/
On Tue, Oct 21, 2025 at 12:06:32PM +0200, Matthias Schiffer wrote:
> On Tue, 2025-10-21 at 15:26 +0530, Beleswar Prasad Padhi wrote:
> > On 21/10/25 15:04, Francesco Dolcini wrote:
> > > On Tue, Oct 21, 2025 at 02:33:10PM +0530, Beleswar Prasad Padhi wrote:
> > > > On 20/10/25 19:47, Hiago De Franco wrote:
> > > > > DM R5 sends a message that is never consumed, since no firmware is
> > > > > running on the M4 (the core is offline).
> > > >
> > > > May I know why you are not running any firmware on the M4
> > > > rproc? If the intention is just to run the DM R5 core on the SoC,
> > > > you can disable the IPC by NOT including the
> > > > "k3-am62-ti-ipc-firmware.dtsi". That was the motivation for the
> > > > refactoring.
> > > Verdin AM62 and AM62P are generic SoMs, that can be used for a multitude
> > > of different use cases. And not having anything running on the M4 is the
> > > default use case.
> >
> > If not having anything on M4 is the default use case, it should
> > be marked as "status=disabled" in the DT.
> >
> > >
> > > I think having the node in the DT is the correct way forward, if you
> > > want to start the M4 firmware you need such a node, so this is enabling
> > > a valid and useful use case.
> >
> > Having the node is fine, you can still choose to keep it
> > disabled by default.
>
> I agree with Francesco that it would be nice to keep the node enabled by default
> - whether something is running on the M4 can be controlled via sysfs after all,
> and may change over the runtime of the OS.
In addition, from what I know, this is required even if you want to
start the firmware of the M4 from U-Boot.
Francesco