[PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume

Peng Fan (OSS) posted 2 patches 3 months, 3 weeks ago
There is a newer version of this series
[PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
Posted by Peng Fan (OSS) 3 months, 3 weeks ago
From: Peng Fan <peng.fan@nxp.com>

When two consecutive suspend messages are sent to the Linux agent, Linux will
suspend and wake up. The expected behaviour should be suspend, wake up
and suspend again.

The ARM SCMI spec does not allow for filtering of which messages an agent
wants to get on the system power protocol. On i.MX95, as we use a mailbox
to receive messages, and the mailbox supports wake up, Linux will also
get a repeated suspend message. This will cause Linux to wake (and it should
then go back into suspend).

In the current driver, the state is set back to SCMI_SYSPOWER_IDLE after
pm_suspend() finishes; however, the workqueue could be scheduled only after
thaw_kernel_threads(), by which time the 2nd suspend notification has
already been handled. So the 2nd suspend will return early with
"Transition already in progress...ignore", and leave Linux in the woken-up
state.

So set SCMI_SYSPOWER_IDLE in the device resume phase, before the workqueue
is scheduled, so that the 2nd suspend message can suspend Linux again.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
---
 drivers/firmware/arm_scmi/scmi_power_control.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/firmware/arm_scmi/scmi_power_control.c b/drivers/firmware/arm_scmi/scmi_power_control.c
index 21f467a92942883be66074c37c2cab08c3e8a5cc..d2cfd9d92e711f7247a13c7773c11c0a6e582876 100644
--- a/drivers/firmware/arm_scmi/scmi_power_control.c
+++ b/drivers/firmware/arm_scmi/scmi_power_control.c
@@ -46,6 +46,7 @@
 #include <linux/math.h>
 #include <linux/module.h>
 #include <linux/mutex.h>
+#include <linux/pm.h>
 #include <linux/printk.h>
 #include <linux/reboot.h>
 #include <linux/scmi_protocol.h>
@@ -324,12 +325,7 @@ static int scmi_userspace_notifier(struct notifier_block *nb,
 
 static void scmi_suspend_work_func(struct work_struct *work)
 {
-	struct scmi_syspower_conf *sc =
-		container_of(work, struct scmi_syspower_conf, suspend_work);
-
 	pm_suspend(PM_SUSPEND_MEM);
-
-	sc->state = SCMI_SYSPOWER_IDLE;
 }
 
 static int scmi_syspower_probe(struct scmi_device *sdev)
@@ -354,6 +350,7 @@ static int scmi_syspower_probe(struct scmi_device *sdev)
 	sc->required_transition = SCMI_SYSTEM_MAX;
 	sc->userspace_nb.notifier_call = &scmi_userspace_notifier;
 	sc->dev = &sdev->dev;
+	dev_set_drvdata(&sdev->dev, sc);
 
 	INIT_WORK(&sc->suspend_work, scmi_suspend_work_func);
 
@@ -363,6 +360,20 @@ static int scmi_syspower_probe(struct scmi_device *sdev)
 						       NULL, &sc->userspace_nb);
 }
 
+static int scmi_system_power_resume(struct device *dev)
+{
+
+	struct scmi_syspower_conf *sc = dev_get_drvdata(dev);
+
+	sc->state = SCMI_SYSPOWER_IDLE;
+
+	return 0;
+}
+
+static const struct dev_pm_ops scmi_system_power_pmops = {
+	SET_SYSTEM_SLEEP_PM_OPS(NULL, scmi_system_power_resume)
+};
+
 static const struct scmi_device_id scmi_id_table[] = {
 	{ SCMI_PROTOCOL_SYSTEM, "syspower" },
 	{ },
@@ -370,6 +381,9 @@ static const struct scmi_device_id scmi_id_table[] = {
 MODULE_DEVICE_TABLE(scmi, scmi_id_table);
 
 static struct scmi_driver scmi_system_power_driver = {
+	.driver	= {
+		.pm = &scmi_system_power_pmops,
+	},
 	.name = "scmi-system-power",
 	.probe = scmi_syspower_probe,
 	.id_table = scmi_id_table,

-- 
2.37.1
Re: [PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
Posted by Dhruva Gole 3 months, 2 weeks ago
On Jun 20, 2025 at 11:37:14 +0800, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
> 
> When two consecutive suspend messages are sent to the Linux agent, Linux will
> suspend and wake up. The expected behaviour should be suspend, wake up

I am first trying to gather more context of the issue at hand here,
Why and who is sending 2 consecutive suspend messages to Linux?

Just quoting the cover letter:

> When testing on i.MX95 with two consecutive suspend messages sent to the Linux
> agent, Linux will suspend (by the 1st suspend message) and wake up (by the
> 2nd suspend message).
> 
> The ARM SCMI spec does not allow for filtering of which messages an agent
> wants to get on the system power protocol. On i.MX95, as we use a mailbox
> to receive messages, and the mailbox supports wake up, Linux will also
> get a repeated suspend message. This will cause Linux to wake (and it should
> then go back into suspend).

When you say mailbox supports wake up you mean the mailbox IP in your
SoC actually gets some sort of wake interrupt that triggers a wakeup?
Is this wakeup sent to the SM then to be processed further and trigger a
linux wakeup?

<or> the mailbox directly wakes up linux, ie. triggers a resume flow but
then you are saying it was an unintentional wakeup so you want to
suspend linux again? This just seems like the wakeup routing is
incorrect and the system is going through a whole resume and then suspend
cycle without a good reason?

Why and when in this flow is linux ending up with a duplicate suspend message is
something I still don't follow.

Could you point us to any flow diagrams or software sequences that we
could review?

> and suspend again.

-- 
Best regards,
Dhruva Gole
Texas Instruments Incorporated
Re: [PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
Posted by Peng Fan 3 months, 2 weeks ago
On Mon, Jun 23, 2025 at 06:27:50PM +0530, Dhruva Gole wrote:
>On Jun 20, 2025 at 11:37:14 +0800, Peng Fan (OSS) wrote:
>> From: Peng Fan <peng.fan@nxp.com>
>> 
>> When two consecutive suspend messages are sent to the Linux agent, Linux will
>> suspend and wake up. The expected behaviour should be suspend, wake up
>
>I am first trying to gather more context of the issue at hand here,
>Why and who is sending 2 consecutive suspend messages to Linux?

Currently, in my test, it is the SCMI platform that sends two suspend messages.
But in real cases, other high-privilege agents could send suspend messages
to the Linux agent.

An agent may wrongly send two suspend messages, either triggered by a user or because the agent is hacked.

>
>Just quoting the cover letter:
>
>> When testing on i.MX95 with two consecutive suspend messages sent to the Linux
>> agent, Linux will suspend (by the 1st suspend message) and wake up (by the
>> 2nd suspend message).
>> 
>> The ARM SCMI spec does not allow for filtering of which messages an agent
>> wants to get on the system power protocol. On i.MX95, as we use a mailbox
>> to receive messages, and the mailbox supports wake up, Linux will also
>> get a repeated suspend message. This will cause Linux to wake (and it should
>> then go back into suspend).
>
>When you say mailbox supports wake up you mean the mailbox IP in your
>SoC actually gets some sort of wake interrupt that triggers a wakeup?

There is no dedicated wake interrupt for the mailbox.

The interrupt is the doorbell for processing notifications, and this
interrupt could also wake up Linux.

>Is this wakeup sent to the SM then to be processed further and trigger a
>linux wakeup?

No. As above, the mailbox received a doorbell notification interrupt.

>
><or> the mailbox directly wakes up linux, ie. triggers a resume flow but
>then you are saying it was an unintentional wakeup so you want to
>suspend linux again?

Right.

>This just seems like the wakeup routing is
>incorrect and the system is going through a whole resume and then suspend
>cycle without a good reason?
>
>Why and when in this flow is linux ending up with a duplicate suspend message is
>something I still don't follow.

Other agents could send duplicate suspend messages, right?
We cannot expect other agents to always behave correctly.

>
>Could you point us to any flow diagrams or software sequences that we
>could review?

Not sure what kind of diagram or sequence you want. It is just one agent
wrongly sending a duplicate suspend message to the Linux agent, and the Linux
agent should suspend again.

One more example:
Linux is suspended, another agent sends a reboot message for Linux, and Linux
should wake up and reboot itself.

Same for suspend:
Linux is suspended, another agent sends a suspend message for Linux, and Linux
wakes up and suspends again.

Regards,
Peng
Re: [PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
Posted by Sudeep Holla 3 months, 2 weeks ago
On Mon, Jun 23, 2025 at 10:29:57PM +0800, Peng Fan wrote:
> 
> One more example:
> Linux is suspended, another agent sends a reboot message for Linux, and Linux
> should wake up and reboot itself.
> 
> Same for suspend:
> Linux is suspended, another agent sends a suspend message for Linux, and Linux
> wakes up and suspends again.
> 

These are very valid requirements and if this is not supported or not
working as expected, it is a BUG in the current implementation.

Unfortunately, lots of details were discussed in private, so I suggest you
repost the patch with all the additional information discussed there
for the benefit of everyone following this list or this thread in
particular. It is unfair to not provide full context on the list.

Just to summarise my understanding here at a very high level: the issue
exists because the second notification from an agent asking Linux to suspend
the system wakes the system up from the suspended state. Since interrupts
are enabled before thaw_processes() (which eventually continues the
execution of scmi_suspend_work_func() to set the state to SCMI_SYSPOWER_IDLE),
scmi_userspace_notifier() is executed much earlier and ends up ignoring
the request, as the state is still not set to SCMI_SYSPOWER_IDLE. There is
a race, which your patch is addressing.
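
To make the ordering concrete, here is a minimal user-space sketch of that
race. It is not kernel code: the model_* names, the two-state enum and the
replay order are assumptions made for illustration, and it assumes the
resume callback runs before the 2nd notification is dispatched, as the
commit message describes.

```c
#include <stdbool.h>

/* Hypothetical model of the driver state; not the kernel definitions. */
enum model_state { MODEL_IDLE, MODEL_IN_PROGRESS };

struct model_syspower {
	enum model_state state;
	int accepted;	/* suspend requests that queued suspend work */
	int ignored;	/* requests dropped as "already in progress" */
};

/* Models scmi_userspace_notifier(): accept a request only when idle. */
static bool model_notify_suspend(struct model_syspower *m)
{
	if (m->state != MODEL_IDLE) {
		m->ignored++;	/* "Transition already in progress...ignore" */
		return false;
	}
	m->state = MODEL_IN_PROGRESS;
	m->accepted++;
	return true;
}

/* Models the proposed device resume callback. */
static void model_device_resume(struct model_syspower *m)
{
	m->state = MODEL_IDLE;
}

/* Models the tail of the old work function, which runs only after thaw. */
static void model_work_func_tail(struct model_syspower *m)
{
	m->state = MODEL_IDLE;
}

/*
 * Replay two back-to-back suspend messages and return how many of them
 * were accepted. reset_in_resume selects the patched behaviour.
 */
int model_replay(bool reset_in_resume)
{
	struct model_syspower m = { .state = MODEL_IDLE };

	model_notify_suspend(&m);		/* 1st message: accepted */
	/* system suspends; the doorbell of the 2nd message wakes it up */
	if (reset_in_resume)
		model_device_resume(&m);	/* patched: idle before notifier */
	model_notify_suspend(&m);		/* 2nd message */
	if (!reset_in_resume)
		model_work_func_tail(&m);	/* unpatched: idle set too late */
	return m.accepted;
}
```

In this model, model_replay(false) accepts only the first request, while
model_replay(true) accepts both, which is the behaviour the patch is after.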

-- 
Regards,
Sudeep
Re: [PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
Posted by Cristian Marussi 3 months, 2 weeks ago
On Mon, Jun 23, 2025 at 10:29:57PM +0800, Peng Fan wrote:
> On Mon, Jun 23, 2025 at 06:27:50PM +0530, Dhruva Gole wrote:
> >On Jun 20, 2025 at 11:37:14 +0800, Peng Fan (OSS) wrote:
> >> From: Peng Fan <peng.fan@nxp.com>
> >> 
> >> When two consecutive suspend messages are sent to the Linux agent, Linux will
> >> suspend and wake up. The expected behaviour should be suspend, wake up
> >
> >I am first trying to gather more context of the issue at hand here,
> >Why and who is sending 2 consecutive suspend messages to Linux?
> 
> Currently, in my test, it is the SCMI platform that sends two suspend messages.
> But in real cases, other high-privilege agents could send suspend messages
> to the Linux agent.

Don't really understand this...a high-privileged supervisor agent would
anyway send a suspend/shutdown request to the SCMI server, which in turn
should be able to filter out such spurious requests...or does such a suspend
request from the supervisor to the agent come through other non-SCMI
means?

> 
> An agent may wrongly send two suspend messages, either triggered by a user or because the agent is hacked.
>

An agent is NOT capable of sending a notification directly to another agent...
...notifications are sent via the P2A channels, which means that the server is
in charge of sending notifs...other agents can cause notifs to be sent, NOT
send them directly...so that the server can filter out any hacked
request...

> >
> >Just quoting the cover letter:
> >
> >> When testing on i.MX95 with two consecutive suspend messages sent to the Linux
> >> agent, Linux will suspend (by the 1st suspend message) and wake up (by the
> >> 2nd suspend message).
> >> 
> >> The ARM SCMI spec does not allow for filtering of which messages an agent
> >> wants to get on the system power protocol. On i.MX95, as we use a mailbox
> >> to receive messages, and the mailbox supports wake up, Linux will also
> >> get a repeated suspend message. This will cause Linux to wake (and it should
> >> then go back into suspend).
> >
> >When you say mailbox supports wake up you mean the mailbox IP in your
> >SoC actually gets some sort of wake interrupt that triggers a wakeup?
> 
> There is no dedicated wake interrupt for the mailbox.
> 
> The interrupt is the doorbell for processing notifications, and this
> interrupt could also wake up Linux.
> 
> >Is this wakeup sent to the SM then to be processed further and trigger a
> >linux wakeup?
> 
> No. As above, the mailbox received a doorbell notification interrupt.
> 
> >
> ><or> the mailbox directly wakes up linux, ie. triggers a resume flow but
> >then you are saying it was an unintentional wakeup so you want to
> >suspend linux again?
> 
> Right.
> 
> >This just seems like the wakeup routing is
> >incorrect and the system is going through a whole resume and then suspend
> >cycle without a good reason?
> >
> >Why and when in this flow is linux ending up with a duplicate suspend message is
> >something I still don't follow.
> 
> Other agents could send duplicate suspend messages, right?
> We cannot expect other agents to always behave correctly.
> 

Absolutely, BUT SCMI is a client/server system and the server is in
charge of filtering out such requests, since each agent has its own dedicated
channel and is identified by the server as agent_X with capabilities_X
from the channel it speaks from (i.e. an agent cannot spoof its identity)

...and the server is the ultimate judge/arbiter of any request, so that
it can drop any unreasonable request...we should NOT delegate such
self-protection mechanisms to the agents...

> >
> >Could you point us to any flow diagrams or software sequences that we
> >could review?
> 
> Not sure what kind of diagram or sequence you want. It is just one agent
> wrongly sending a duplicate suspend message to the Linux agent, and the Linux
> agent should suspend again.
> 
> One more example:
> Linux is suspended, another agent sends a reboot message for Linux, and Linux
> should wake up and reboot itself.

Yes...another privileged agent requests a Reboot for agent_X (SYSPOWER_STATE_SET)
from the server, and the server in turn sends a Reboot notification to the
suspended agent_X, which is woken up by the notification and proceeds
with a graceful shutdown/reboot...if this does NOT happen it is
definitely a bug...

> 
> Same for suspend:
> Linux is suspended, another agent sends a suspend message for Linux, and Linux
> wakes up and suspends again.

In theory yes, it should work like this, BUT it is better if the 2nd message
never comes (as explained above)...if this happens, I would say log this
as a warning too, because it is not a normal scenario...i.e. if you
receive multiple suspends for the same agent from the same server...
...something is wrong...I agree Linux should survive (and suspend back),
BUT this should not be allowed in the first place (filtered out)
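
As a rough illustration of the kind of server-side filtering meant here, a
platform could remember that a suspend notification is already outstanding
for an agent and drop repeats. This is only a sketch: the model_* names,
the fixed agent count and the way the server learns that an agent has
resumed are all assumptions, not anything taken from the spec or from a
real firmware.

```c
#include <stdbool.h>

#define MODEL_MAX_AGENTS 4	/* hypothetical limit for this sketch */

/* Hypothetical per-agent bookkeeping kept by the platform (server). */
struct model_server {
	bool suspend_pending[MODEL_MAX_AGENTS];
	int dropped;	/* duplicate requests filtered out */
};

/*
 * Decide whether a suspend notification for 'target' should be sent on
 * its P2A channel. Returns true if the notification is forwarded.
 */
bool model_server_request_suspend(struct model_server *s, unsigned int target)
{
	if (target >= MODEL_MAX_AGENTS)
		return false;		/* unknown agent: nothing to send */
	if (s->suspend_pending[target]) {
		s->dropped++;		/* repeat request: filter it out */
		return false;
	}
	s->suspend_pending[target] = true;
	return true;
}

/* Called when the platform learns that the target agent is running again. */
void model_server_agent_resumed(struct model_server *s, unsigned int target)
{
	if (target < MODEL_MAX_AGENTS)
		s->suspend_pending[target] = false;
}
```

With this bookkeeping a second request for the same agent is counted in
dropped and never reaches the P2A channel until the agent has resumed.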

Thanks,
Cristian
Re: [PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
Posted by kernel test robot 3 months, 2 weeks ago
Hi Peng,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 4325743c7e209ae7845293679a4de94b969f2bef]

url:    https://github.com/intel-lab-lkp/linux/commits/Peng-Fan-OSS/firmware-arm_scmi-bus-Add-pm-ops/20250620-114042
base:   4325743c7e209ae7845293679a4de94b969f2bef
patch link:    https://lore.kernel.org/r/20250620-scmi-pm-v1-2-c2f02cae5122%40nxp.com
patch subject: [PATCH 2/2] firmware: arm_scmi: power_control: Set SCMI_SYSPOWER_IDLE in pm resume
config: arm-randconfig-002-20250621 (https://download.01.org/0day-ci/archive/20250621/202506210114.2Ix0TkA0-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250621/202506210114.2Ix0TkA0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506210114.2Ix0TkA0-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/firmware/arm_scmi/scmi_power_control.c:363:12: warning: 'scmi_system_power_resume' defined but not used [-Wunused-function]
     363 | static int scmi_system_power_resume(struct device *dev)
         |            ^~~~~~~~~~~~~~~~~~~~~~~~


vim +/scmi_system_power_resume +363 drivers/firmware/arm_scmi/scmi_power_control.c

   362	
 > 363	static int scmi_system_power_resume(struct device *dev)
   364	{
   365	
   366		struct scmi_syspower_conf *sc = dev_get_drvdata(dev);
   367	
   368		sc->state = SCMI_SYSPOWER_IDLE;
   369	
   370		return 0;
   371	}
   372	
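
A likely cause is that SET_SYSTEM_SLEEP_PM_OPS() expands to nothing when
CONFIG_PM_SLEEP is disabled, which leaves the resume callback unreferenced.
One possible fix, offered as a suggestion and not as what the author must
do, is to switch to SYSTEM_SLEEP_PM_OPS() and wrap the ops pointer in
pm_sleep_ptr(). The user-space sketch below only models that pattern; the
model_* types and MODEL_* macros are stand-ins written for this example.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct dev_pm_ops. */
struct model_device { void *drvdata; };

struct model_pm_ops {
	int (*suspend)(struct model_device *dev);
	int (*resume)(struct model_device *dev);
};

/* Define as 1 to model a build with sleep support enabled. */
#ifndef MODEL_PM_SLEEP
#define MODEL_PM_SLEEP 0
#endif

/* Always references both callbacks, so they are never "unused". */
#define MODEL_SYSTEM_SLEEP_PM_OPS(suspend_fn, resume_fn) \
	.suspend = (suspend_fn), .resume = (resume_fn),

/* Evaluates to NULL when sleep support is off, so the ops can be dropped. */
#define model_pm_sleep_ptr(ptr) (MODEL_PM_SLEEP ? (ptr) : NULL)

static int model_resume(struct model_device *dev)
{
	(void)dev;	/* the real callback would reset the driver state */
	return 0;
}

static const struct model_pm_ops model_pmops = {
	MODEL_SYSTEM_SLEEP_PM_OPS(NULL, model_resume)
};

/* What the driver would place in its .pm field. */
const struct model_pm_ops *model_driver_pm(void)
{
	return model_pm_sleep_ptr(&model_pmops);
}
```

Because the callback is always referenced by the initializer, the compiler
no longer reports it as unused, and with sleep support disabled the driver
simply ends up with a NULL ops pointer.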

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki