From: Peng Fan <peng.fan@nxp.com>

When two consecutive suspend messages are sent to the Linux agent, Linux
suspends and then wakes up. The expected behaviour is suspend, wake up,
and suspend again.

The ARM SCMI spec does not allow an agent to filter which messages it
receives on the system power protocol. On i.MX95, messages are received
through a mailbox, and the mailbox supports wake up, so Linux also gets
the repeated suspend message. This causes Linux to wake (and it should
then go back into suspend).

In the current driver, the state is set back to SCMI_SYSPOWER_IDLE after
pm_suspend() finishes; however, the workqueue may only be scheduled after
thaw_kernel_threads(). So the 2nd suspend returns early with
"Transition already in progress...ignore" and leaves Linux awake.

So set SCMI_SYSPOWER_IDLE in the device resume phase, before the
workqueue is scheduled, so that the 2nd suspend message can suspend
Linux again.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
---
drivers/firmware/arm_scmi/scmi_power_control.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/firmware/arm_scmi/scmi_power_control.c b/drivers/firmware/arm_scmi/scmi_power_control.c
index 21f467a92942883be66074c37c2cab08c3e8a5cc..d2cfd9d92e711f7247a13c7773c11c0a6e582876 100644
--- a/drivers/firmware/arm_scmi/scmi_power_control.c
+++ b/drivers/firmware/arm_scmi/scmi_power_control.c
@@ -46,6 +46,7 @@
 #include <linux/math.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/pm.h>
 #include <linux/printk.h>
 #include <linux/reboot.h>
 #include <linux/scmi_protocol.h>
@@ -324,12 +325,7 @@ static int scmi_userspace_notifier(struct notifier_block *nb,

 static void scmi_suspend_work_func(struct work_struct *work)
 {
-	struct scmi_syspower_conf *sc =
-		container_of(work, struct scmi_syspower_conf, suspend_work);
-
 	pm_suspend(PM_SUSPEND_MEM);
-
-	sc->state = SCMI_SYSPOWER_IDLE;
 }

 static int scmi_syspower_probe(struct scmi_device *sdev)
@@ -354,6 +350,7 @@ static int scmi_syspower_probe(struct scmi_device *sdev)
 	sc->required_transition = SCMI_SYSTEM_MAX;
 	sc->userspace_nb.notifier_call = &scmi_userspace_notifier;
 	sc->dev = &sdev->dev;
+	dev_set_drvdata(&sdev->dev, sc);

 	INIT_WORK(&sc->suspend_work, scmi_suspend_work_func);

@@ -363,6 +360,20 @@ static int scmi_syspower_probe(struct scmi_device *sdev)
 						       NULL, &sc->userspace_nb);
 }

+static int scmi_system_power_resume(struct device *dev)
+{
+
+	struct scmi_syspower_conf *sc = dev_get_drvdata(dev);
+
+	sc->state = SCMI_SYSPOWER_IDLE;
+
+	return 0;
+}
+
+static const struct dev_pm_ops scmi_system_power_pmops = {
+	SET_SYSTEM_SLEEP_PM_OPS(NULL, scmi_system_power_resume)
+};
+
 static const struct scmi_device_id scmi_id_table[] = {
 	{ SCMI_PROTOCOL_SYSTEM, "syspower" },
 	{ },
@@ -370,6 +381,9 @@ static const struct scmi_device_id scmi_id_table[] = {
 MODULE_DEVICE_TABLE(scmi, scmi_id_table);

 static struct scmi_driver scmi_system_power_driver = {
+	.driver = {
+		.pm = &scmi_system_power_pmops,
+	},
 	.name = "scmi-system-power",
 	.probe = scmi_syspower_probe,
 	.id_table = scmi_id_table,
--
2.37.1

On Jun 20, 2025 at 11:37:14 +0800, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> When two consecutive suspend message send to the Linux agent, Linux will
> suspend and wake up. The exepcted behaviour should be suspend, wake up

I am first trying to gather more context of the issue at hand here.
Why and who is sending 2 consecutive suspend messages to Linux?

Just quoting the cover letter:

> When testing on i.MX95, two consecutive suspend message send to the Linux
> agent, Linux will suspend(by the 1st suspend message) and wake up(by the
> 2nd suspend message).
>
> The ARM SCMI spec does not allow for filtering of which messages an agent
> wants to get on the system power protocol. To i.MX95, as we use mailbox
> to receive message, and the mailbox supports wake up, so linux will also
> get a repeated suspend message. This will cause Linux to wake (and should
> then go back into suspend).

When you say mailbox supports wake up you mean the mailbox IP in your
SoC actually gets some sort of wake interrupt that triggers a wakeup?
Is this wakeup sent to the SM then to be processed further and trigger a
linux wakeup?

<or> the mailbox directly wakes up linux, i.e. triggers a resume flow but
then you are saying it was an unintentional wakeup so you want to
suspend linux again? This just seems like the wakeup routing is
incorrect and the system is going through a whole resume and then suspend
cycle without a good reason?

Why and when in this flow linux ends up with a duplicate suspend message is
something I still don't follow.

Could you point us to any flow diagrams or software sequences that we
could review?

> and suspend again.
>
> The ARM SCMI spec does not allow for filtering of which messages an agent
> wants to get on the system power protocol. To i.MX95, as we use mailbox
> to receive message, and the mailbox supports wake up, so linux will also
> get a repeated suspend message. This will cause Linux to wake (and should
> then go back into suspend).
>
> In current driver, the state is set back to SCMI_SYSPOWER_IDLE after
> pm_suspend finish, however the workqueue could be scheduled after
> thaw_kernel_threads. So the 2nd suspend will return early with
> "Transition already in progress...ignore", and leave Linux in wakeup
> state.
>
> So set SCMI_SYSPOWER_IDLE in device resume phase before workqueue
> is scheduled to make the 2nd suspend message could suspend Linux again.
>
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
> ---
> drivers/firmware/arm_scmi/scmi_power_control.c | 24 +++++++++++++++++++-----
> 1 file changed, 19 insertions(+), 5 deletions(-)
>
[...]

--
Best regards,
Dhruva Gole
Texas Instruments Incorporated

On Mon, Jun 23, 2025 at 06:27:50PM +0530, Dhruva Gole wrote:
>On Jun 20, 2025 at 11:37:14 +0800, Peng Fan (OSS) wrote:
>> From: Peng Fan <peng.fan@nxp.com>
>>
>> When two consecutive suspend message send to the Linux agent, Linux will
>> suspend and wake up. The exepcted behaviour should be suspend, wake up
>
>I am first trying to gather more context of the issue at hand here,
>Why and who is sending 2 consecutive suspend messages to Linux?

Currently in my test, it is the SCMI platform that sends two suspend messages.
But in real cases, other highly privileged agents could send suspend messages
to the Linux agent.

One agent may wrongly send two suspend messages, either through user error or
because the agent has been hacked.

>
>Just quoting the cover letter:
>
>> When testing on i.MX95, two consecutive suspend message send to the Linux
>> agent, Linux will suspend(by the 1st suspend message) and wake up(by the
>> 2nd suspend message).
>>
>> The ARM SCMI spec does not allow for filtering of which messages an agent
>> wants to get on the system power protocol. To i.MX95, as we use mailbox
>> to receive message, and the mailbox supports wake up, so linux will also
>> get a repeated suspend message. This will cause Linux to wake (and should
>> then go back into suspend).
>
>When you say mailbox supports wake up you mean the mailbox IP in your
>SoC actually gets some sort of wake interrupt that triggers a wakeup?

There is no dedicated wake interrupt for the mailbox.

The interrupt is the doorbell for processing notifications, and this
interrupt can also wake up Linux.

>Is this wakeup sent to the SM then to be processed further and trigger a
>linux wakeup?

No. As above, the mailbox received a doorbell notification interrupt.

>
><or> the mailbox directly wakes up linux, ie. triggers a resume flow but
>then you are saying it was an unintentional wakeup so you want to
>suspend linux again?

Right.

>This just seems like the wakeup routing is
>incorrect and the system is going through a who resume and then suspend
>cycle without a good reason?
>
>Why and when in this flow is linux ending up with a duplicate suspend message is
>something I still don't follow.

Other agents could send duplicated suspend messages, right?
We cannot expect other agents to always behave correctly.

>
>Could you point us to any flow diagrams or software sequences that we
>could review?

Not sure what kind of diagram or sequence you want. It is just one agent
wrongly sending a duplicate suspend message to the Linux agent, and the
Linux agent should suspend again.

One more example:
Linux is suspended, another agent sends a reboot message for Linux, and
Linux should wake up and reboot itself.

The same applies to suspend:
Linux is suspended, another agent sends a suspend message for Linux, and
Linux should wake up and suspend again.

Regards,
Peng

>
>> and suspend again.
>>
>> The ARM SCMI spec does not allow for filtering of which messages an agent
>> wants to get on the system power protocol. To i.MX95, as we use mailbox
>> to receive message, and the mailbox supports wake up, so linux will also
>> get a repeated suspend message. This will cause Linux to wake (and should
>> then go back into suspend).
>>
>> In current driver, the state is set back to SCMI_SYSPOWER_IDLE after
>> pm_suspend finish, however the workqueue could be scheduled after
>> thaw_kernel_threads. So the 2nd suspend will return early with
>> "Transition already in progress...ignore", and leave Linux in wakeup
>> state.
>>
>> So set SCMI_SYSPOWER_IDLE in device resume phase before workqueue
>> is scheduled to make the 2nd suspend message could suspend Linux again.
>>
>> Signed-off-by: Peng Fan <peng.fan@nxp.com>
>> ---
>> drivers/firmware/arm_scmi/scmi_power_control.c | 24 +++++++++++++++++++-----
>> 1 file changed, 19 insertions(+), 5 deletions(-)
>>
>[...]
>
>--
>Best regards,
>Dhruva Gole
>Texas Instruments Incorporated

On Mon, Jun 23, 2025 at 10:29:57PM +0800, Peng Fan wrote:
>
> One more example is
> Linux suspended, other agent send reboot linux message, Linux should
> wakeup and reboot itself.
>
> Same to suspend
> Linux suspended, other agent send suspend Linux message, Linux wakeup
> and suspend again.
>

These are very valid requirements, and if this is not supported or not
working as expected, it is a BUG in the current implementation.

As lots of details were unfortunately discussed in private, I suggest you
repost the patch with all the additional information discussed there
for the benefit of all the people following this list or this thread in
particular. It is unfair to not provide full context on the list.

Just to summarise my understanding here at a very high level: the issue
exists because the second notification by an agent asking Linux to suspend
the system wakes the system up from the suspend state. Since the interrupts
are enabled before thaw_processes() (which eventually continues the
execution of scmi_suspend_work_func() to set the state to
SCMI_SYSPOWER_IDLE), scmi_userspace_notifier() is executed much earlier
and ends up ignoring the request, as the state is still not set to
SCMI_SYSPOWER_IDLE. There is a race which your patch is addressing.

--
Regards,
Sudeep
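
The race summarised above can be modelled outside the kernel. The following
is a simplified, single-threaded sketch, assuming the event ordering
described in this thread: the notifier for the 2nd message runs before the
suspend work continues, and the device resume callback runs before that
notifier. The `model_*` names are hypothetical and only mirror the driver's
idle/in-progress guard; this is not the driver code.

```c
#include <stdbool.h>

enum model_state { MODEL_IDLE, MODEL_IN_PROGRESS };

struct model_syspower {
	enum model_state state;
	int suspends_started;
};

/* Mirrors the notifier's guard: a request is accepted only when idle. */
static bool model_notify_suspend(struct model_syspower *m)
{
	if (m->state != MODEL_IDLE)
		return false;	/* "Transition already in progress...ignore" */
	m->state = MODEL_IN_PROGRESS;
	m->suspends_started++;
	return true;
}

/*
 * Replays: 1st suspend message, wakeup caused by the 2nd message, then the
 * handling of that 2nd message. Returns whether the 2nd one was accepted.
 */
static bool model_second_suspend_accepted(bool reset_in_resume)
{
	struct model_syspower m = { MODEL_IDLE, 0 };
	bool accepted;

	model_notify_suspend(&m);	/* 1st message: system suspends */

	if (reset_in_resume)
		m.state = MODEL_IDLE;	/* patched: device resume callback */

	accepted = model_notify_suspend(&m);	/* 2nd message is handled */

	if (!reset_in_resume)
		m.state = MODEL_IDLE;	/* unpatched: work continues too late */

	return accepted;
}
```

With `reset_in_resume` set to false the 2nd request is ignored, which matches
the "Transition already in progress" case; with it set to true the request is
accepted and a second suspend can start.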

On Mon, Jun 23, 2025 at 10:29:57PM +0800, Peng Fan wrote:
> On Mon, Jun 23, 2025 at 06:27:50PM +0530, Dhruva Gole wrote:
> >On Jun 20, 2025 at 11:37:14 +0800, Peng Fan (OSS) wrote:
> >> From: Peng Fan <peng.fan@nxp.com>
> >>
> >> When two consecutive suspend message send to the Linux agent, Linux will
> >> suspend and wake up. The exepcted behaviour should be suspend, wake up
> >
> >I am first trying to gather more context of the issue at hand here,
> >Why and who is sending 2 consecutive suspend messages to Linux?
>
> Currently in my test, it is SCMI platform send two suspend messages.
> But in real cases, other high priviledge agents could send suspend messages
> to linux agent.

Don't really understand this...a high-privileged supervisor agent would
anyway send a suspend/shutdown request to the SCMI server, which in turn
should be able to filter out such spurious requests...or does such a suspend
request from the supervisor to the agent come through other non-SCMI
means?

>
> One agent may wrongly send two suspend messages by user or the agent is hacked.
>

An agent is NOT capable of sending a notification directly to another agent...

...notifications are sent via the P2A channels, which means that the server
is in charge of sending notifs...other agents can cause notifs to be sent, NOT
send them directly...so that the server can filter out any hacked request...

> >
> >Just quoting the cover letter:
> >
> >> When testing on i.MX95, two consecutive suspend message send to the Linux
> >> agent, Linux will suspend(by the 1st suspend message) and wake up(by the
> >> 2nd suspend message).
> >>
> >> The ARM SCMI spec does not allow for filtering of which messages an agent
> >> wants to get on the system power protocol. To i.MX95, as we use mailbox
> >> to receive message, and the mailbox supports wake up, so linux will also
> >> get a repeated suspend message. This will cause Linux to wake (and should
> >> then go back into suspend).
> >
> >When you say mailbox supports wake up you mean the mailbox IP in your
> >SoC actually gets some sort of wake interrupt that triggers a wakeup?
>
> There is no dedicated wake interrupt for mailbox.
>
> The interrupt is the doorbell for processing notification, and this
> interrupt could also wakeup Linux.
>
> >Is this wakeup sent to the SM then to be processed further and trigger a
> >linux wakeup?
>
> No. As above, the mailbox received a doorbell notification interrupt.
>
> >
> ><or> the mailbox directly wakes up linux, ie. triggers a resume flow but
> >then you are saying it was an unintentional wakeup so you want to
> >suspend linux again?
>
> Right.
>
> >This just seems like the wakeup routing is
> >incorrect and the system is going through a who resume and then suspend
> >cycle without a good reason?
> >
> >Why and when in this flow is linux ending up with a duplicate suspend message is
> >something I still don't follow.
>
> Other agents could send duplicated suspend messages, right?
> We could not expect other agents always behave correctly.
>

Absolutely, BUT SCMI is a client/server system and the server is in charge
of filtering out such requests, since each agent has its own dedicated
channel and is identified by the server as agent_X with capabilities_X
from the channel it speaks from (i.e. an agent cannot spoof its identity)
...and the server is the ultimate judge/arbiter of any request so that it
can drop any unreasonable request...we should NOT delegate such
self-protection mechanisms to the agents...

> >
> >Could you point us to any flow diagrams or software sequences that we
> >could review?
>
> Not sure what kind diagram or sequences you wanna. It is just one agent
> wrongly send duplicate suspend message to Linux agent. And Linux agent
> should suspend again.
>
> One more example is
> Linux suspended, other agent send reboot linux message, Linux should
> wakeup and reboot itself.

Yes...another privileged agent requests a Reboot for agent_X
(SYSPOWER_STATE_SET) from the server, and the server in turn sends a Reboot
notification to the suspended agent_X, which is woken up by the
notification and proceeds with a graceful shutdown/reboot...if this does
NOT happen it is definitely a bug..

>
> Same to suspend
> Linux suspended, other agent send suspend Linux message, Linux wakeup
> and suspend again.

In theory yes, it should work like this, BUT better if the 2nd message
never comes (as explained above)...if this happens, I would say log it as
a warning too because it is not a normal scenario...i.e. if you receive
multiple suspends for the same agent from the same server...

...something is wrong...I agree Linux should survive (and suspend back) BUT
this should not be allowed in the first place (filtered out)

Thanks,
Cristian
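
The server-side filtering argued for above could look roughly like the
following. This is a minimal sketch and is not taken from any SCMI server
implementation: `struct model_agent`, `model_server_request_suspend()` and
`model_server_agent_resumed()` are hypothetical names, and it assumes the
server tracks the last power state it requested for each agent. A repeated
suspend request for an agent that is already suspended is dropped and
counted instead of being forwarded as a second notification.

```c
#include <stdbool.h>

/* Hypothetical per-agent bookkeeping kept by the SCMI server (platform). */
struct model_agent {
	bool suspended;		/* last power state the server requested */
	unsigned int dropped;	/* spurious requests filtered out so far */
};

/*
 * Hypothetical handler for "suspend agent X" requests coming from a
 * privileged agent. Returns true when a notification should be sent on the
 * target's P2A channel, false when the request is a duplicate and dropped.
 */
static bool model_server_request_suspend(struct model_agent *target)
{
	if (target->suspended) {
		target->dropped++;	/* a real server might log a warning */
		return false;
	}
	target->suspended = true;
	return true;
}

/* Called when the server learns that the agent is running again. */
static void model_server_agent_resumed(struct model_agent *target)
{
	target->suspended = false;
}
```

Keeping this state in the server fits the point made above that agents
cannot notify each other directly, so the server sees every request and is
the natural place to drop duplicates.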

Hi Peng,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 4325743c7e209ae7845293679a4de94b969f2bef]

url:    https://github.com/intel-lab-lkp/linux/commits/Peng-Fan-OSS/firmware-arm_scmi-bus-Add-pm-ops/20250620-114042
base:   4325743c7e209ae7845293679a4de94b969f2bef
patch link:    https://lore.kernel.org/r/20250620-scmi-pm-v1-2-c2f02cae5122%40nxp.com
patch subject: [PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
config: arm-randconfig-002-20250621 (https://download.01.org/0day-ci/archive/20250621/202506210114.2Ix0TkA0-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250621/202506210114.2Ix0TkA0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506210114.2Ix0TkA0-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/firmware/arm_scmi/scmi_power_control.c:363:12: warning: 'scmi_system_power_resume' defined but not used [-Wunused-function]
     363 | static int scmi_system_power_resume(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~

vim +/scmi_system_power_resume +363 drivers/firmware/arm_scmi/scmi_power_control.c

   362
 > 363	static int scmi_system_power_resume(struct device *dev)
   364	{
   365
   366		struct scmi_syspower_conf *sc = dev_get_drvdata(dev);
   367
   368		sc->state = SCMI_SYSPOWER_IDLE;
   369
   370		return 0;
   371	}
   372

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
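
The warning follows from how `SET_SYSTEM_SLEEP_PM_OPS()` behaves: when
`CONFIG_PM_SLEEP` is disabled it expands to nothing, so the static resume
callback is never referenced. The kernel's newer `SYSTEM_SLEEP_PM_OPS()`
helper, built on `pm_sleep_ptr()`, keeps a reference that the compiler can
see and then discard. The stand-alone model below illustrates that pattern
outside the kernel. `MODEL_PM_SLEEP`, `model_sleep_ptr()` and the `model_*`
types are hypothetical stand-ins, not kernel APIs, and this is one possible
way to address the report rather than the fix the author chose.

```c
#include <stddef.h>

/* Stand-in for CONFIG_PM_SLEEP; 0 models a build without sleep support. */
#define MODEL_PM_SLEEP 0

/* Same idea as the kernel's pm_sleep_ptr(): keep the pointer only if enabled. */
#define model_sleep_ptr(ptr) (MODEL_PM_SLEEP ? (ptr) : NULL)

struct model_dev {
	int state;
};

struct model_pm_ops {
	int (*resume)(struct model_dev *dev);
};

/* Always compiled and always referenced below, so it is never "unused". */
static int model_resume(struct model_dev *dev)
{
	dev->state = 0;	/* models sc->state = SCMI_SYSPOWER_IDLE */
	return 0;
}

static const struct model_pm_ops model_pmops = {
	.resume = model_sleep_ptr(model_resume),
};

/* Runs the resume callback if one was installed; returns 0 otherwise. */
static int model_run_resume(struct model_dev *dev)
{
	return model_pmops.resume ? model_pmops.resume(dev) : 0;
}
```

Because the callback is named in the initializer in both configurations, no
`#ifdef` or `__maybe_unused` annotation is needed in this model; with
`MODEL_PM_SLEEP` set to 0 the pointer simply becomes NULL.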