From: David Woodhouse <dwmw@amazon.co.uk>
The PSCI v1.3 specification (alpha) adds support for a SYSTEM_OFF2
function which is analogous to ACPI S4 state. This will allow hosting
environments to determine that a guest is hibernated rather than just
powered off, and handle that state appropriately on subsequent launches.
Since commit 60c0d45a7f7a ("efi/arm64: use UEFI for system reset and
poweroff") the EFI shutdown method is deliberately preferred over PSCI
or other methods. So register a SYS_OFF_MODE_POWER_OFF handler which
*only* handles the hibernation, leaving the original PSCI SYSTEM_OFF as
a last resort via the legacy pm_power_off function pointer.
The hibernation code already exports a system_entering_hibernation()
function which is used by the higher-priority handler to check for
hibernation. That existing function just returns the value of a static
boolean variable from hibernate.c, which was previously only set in the
hibernation_platform_enter() code path. Set the same flag in the simpler
code path around the call to kernel_power_off() too.
An alternative way to hook SYSTEM_OFF2 into the hibernation code would
be to register a platform_hibernation_ops structure with an ->enter()
method which makes the new SYSTEM_OFF2 call. But that would have the
unwanted side-effect of making hibernation take a completely different
code path in hibernation_platform_enter(), invoking a lot of special dpm
callbacks.
Another option might be to add a new SYS_OFF_MODE_HIBERNATE mode, with
fallback to SYS_OFF_MODE_POWER_OFF. Or to use the sys_off_data to
indicate whether the power off is for hibernation.
But this version works and is relatively simple.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
drivers/firmware/psci/psci.c | 35 +++++++++++++++++++++++++++++++++++
kernel/power/hibernate.c | 5 ++++-
2 files changed, 39 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index d9629ff87861..69d2f6969438 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -78,6 +78,7 @@ struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
static u32 psci_cpu_suspend_feature;
static bool psci_system_reset2_supported;
+static bool psci_system_off2_supported;
static inline bool psci_has_ext_power_state(void)
{
@@ -333,6 +334,28 @@ static void psci_sys_poweroff(void)
invoke_psci_fn(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
}
+#ifdef CONFIG_HIBERNATION
+static int psci_sys_hibernate(struct sys_off_data *data)
+{
+ if (system_entering_hibernation())
+ invoke_psci_fn(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2),
+ PSCI_1_3_HIBERNATE_TYPE_OFF, 0, 0);
+ return NOTIFY_DONE;
+}
+
+static int __init psci_hibernate_init(void)
+{
+ if (psci_system_off2_supported) {
+ /* Higher priority than EFI shutdown, but only for hibernate */
+ register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
+ SYS_OFF_PRIO_FIRMWARE + 2,
+ psci_sys_hibernate, NULL);
+ }
+ return 0;
+}
+subsys_initcall(psci_hibernate_init);
+#endif
+
static int psci_features(u32 psci_func_id)
{
return invoke_psci_fn(PSCI_1_0_FN_PSCI_FEATURES,
@@ -364,6 +387,7 @@ static const struct {
PSCI_ID_NATIVE(1_1, SYSTEM_RESET2),
PSCI_ID(1_1, MEM_PROTECT),
PSCI_ID_NATIVE(1_1, MEM_PROTECT_CHECK_RANGE),
+ PSCI_ID_NATIVE(1_3, SYSTEM_OFF2),
};
static int psci_debugfs_read(struct seq_file *s, void *data)
@@ -523,6 +547,16 @@ static void __init psci_init_system_reset2(void)
psci_system_reset2_supported = true;
}
+static void __init psci_init_system_off2(void)
+{
+ int ret;
+
+ ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
+
+ if (ret != PSCI_RET_NOT_SUPPORTED)
+ psci_system_off2_supported = true;
+}
+
static void __init psci_init_system_suspend(void)
{
int ret;
@@ -653,6 +687,7 @@ static int __init psci_probe(void)
psci_init_cpu_suspend();
psci_init_system_suspend();
psci_init_system_reset2();
+ psci_init_system_off2();
kvm_init_hyp_services();
}
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 4b0b7cf2e019..ac87b3cb670c 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -676,8 +676,11 @@ static void power_down(void)
}
fallthrough;
case HIBERNATION_SHUTDOWN:
- if (kernel_can_power_off())
+ if (kernel_can_power_off()) {
+ entering_platform_hibernation = true;
kernel_power_off();
+ entering_platform_hibernation = false;
+ }
break;
}
kernel_halt();
--
2.44.0
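The handler ordering described in the commit message can be sketched in userspace. This is only a model, not kernel code: every model_* name, the MODEL_PRIO_FIRMWARE value, and the assumption that the EFI handler sits one step below the new handler are illustrative. It shows why a higher-priority handler that acts only during hibernation leaves the ordinary power-off path to EFI.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these model_* names exist in the kernel. */
enum model_action { MODEL_NONE, MODEL_PSCI_SYSTEM_OFF2, MODEL_EFI_POWEROFF };

struct model_handler {
	int priority;                        /* higher value runs first */
	bool (*fn)(enum model_action *out);  /* true: power-off handled, stop */
};

/* Stands in for system_entering_hibernation(). */
static bool model_entering_hibernation;

/* Acts only during hibernation; otherwise declines so the chain continues. */
static bool model_psci_hibernate(enum model_action *out)
{
	if (!model_entering_hibernation)
		return false;
	*out = MODEL_PSCI_SYSTEM_OFF2;
	return true;  /* a successful firmware call would never return */
}

/* Stands in for the EFI power-off handler, which always acts. */
static bool model_efi_poweroff(enum model_action *out)
{
	*out = MODEL_EFI_POWEROFF;
	return true;
}

#define MODEL_PRIO_FIRMWARE 128  /* illustrative base value (assumption) */

/* Kept sorted by descending priority, as a registration step would do. */
static const struct model_handler model_chain[] = {
	{ MODEL_PRIO_FIRMWARE + 2, model_psci_hibernate },
	{ MODEL_PRIO_FIRMWARE + 1, model_efi_poweroff },
};

/* Walk the chain in priority order and report which handler acted. */
static enum model_action model_power_off(bool hibernating)
{
	enum model_action action = MODEL_NONE;
	size_t i;

	model_entering_hibernation = hibernating;
	for (i = 0; i < sizeof(model_chain) / sizeof(model_chain[0]); i++)
		if (model_chain[i].fn(&action))
			break;
	model_entering_hibernation = false;
	return action;
}
```

Calling model_power_off(true) selects the SYSTEM_OFF2 path, while model_power_off(false) falls through to the EFI stand-in, which mirrors the "only for hibernate" comment in the patch.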
On Tue, Mar 12, 2024 at 01:51:29PM +0000, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> The PSCI v1.3 specification (alpha) adds support for a SYSTEM_OFF2
> function which is analogous to ACPI S4 state. This will allow hosting
> environments to determine that a guest is hibernated rather than just
> powered off, and handle that state appropriately on subsequent launches.
>
> Since commit 60c0d45a7f7a ("efi/arm64: use UEFI for system reset and
> poweroff") the EFI shutdown method is deliberately preferred over PSCI
> or other methods. So register a SYS_OFF_MODE_POWER_OFF handler which
> *only* handles the hibernation, leaving the original PSCI SYSTEM_OFF as
> a last resort via the legacy pm_power_off function pointer.
>
> The hibernation code already exports a system_entering_hibernation()
> function which is used by the higher-priority handler to check for
> hibernation. That existing function just returns the value of a static
> boolean variable from hibernate.c, which was previously only set in the
> hibernation_platform_enter() code path. Set the same flag in the simpler
> code path around the call to kernel_power_off() too.
>
> An alternative way to hook SYSTEM_OFF2 into the hibernation code would
> be to register a platform_hibernation_ops structure with an ->enter()
> method which makes the new SYSTEM_OFF2 call. But that would have the
> unwanted side-effect of making hibernation take a completely different
> code path in hibernation_platform_enter(), invoking a lot of special dpm
> callbacks.
>
> Another option might be to add a new SYS_OFF_MODE_HIBERNATE mode, with
> fallback to SYS_OFF_MODE_POWER_OFF. Or to use the sys_off_data to
> indicate whether the power off is for hibernation.
>
> But this version works and is relatively simple.
>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> drivers/firmware/psci/psci.c | 35 +++++++++++++++++++++++++++++++++++
> kernel/power/hibernate.c | 5 ++++-
> 2 files changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
> index d9629ff87861..69d2f6969438 100644
> --- a/drivers/firmware/psci/psci.c
> +++ b/drivers/firmware/psci/psci.c
> @@ -78,6 +78,7 @@ struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
>
> static u32 psci_cpu_suspend_feature;
> static bool psci_system_reset2_supported;
> +static bool psci_system_off2_supported;
>
> static inline bool psci_has_ext_power_state(void)
> {
> @@ -333,6 +334,28 @@ static void psci_sys_poweroff(void)
> invoke_psci_fn(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
> }
>
> +#ifdef CONFIG_HIBERNATION
> +static int psci_sys_hibernate(struct sys_off_data *data)
> +{
> + if (system_entering_hibernation())
> + invoke_psci_fn(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2),
> + PSCI_1_3_HIBERNATE_TYPE_OFF, 0, 0);
> + return NOTIFY_DONE;
> +}
> +
> +static int __init psci_hibernate_init(void)
> +{
> + if (psci_system_off2_supported) {
> + /* Higher priority than EFI shutdown, but only for hibernate */
> + register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
> + SYS_OFF_PRIO_FIRMWARE + 2,
> + psci_sys_hibernate, NULL);
> + }
> + return 0;
> +}
> +subsys_initcall(psci_hibernate_init);
Looked briefly at register_sys_off_handler and it should be OK to call
it from psci_init_system_off2() below. Any particular reason for having
separate initcall to do this ? We can even eliminate the need for
psci_init_system_off2 if it can be called from there. What am I missing ?
--
Regards,
Sudeep
On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> Looked briefly at register_sys_off_handler and it should be OK to call
> it from psci_init_system_off2() below. Any particular reason for having
> separate initcall to do this ? We can even eliminate the need for
> psci_init_system_off2 if it can be called from there. What am I missing ?

My first attempt did that. I don't think we can kmalloc that early:

[ 0.000000] psci: SMC Calling Convention v1.1
[ 0.000000] Unable to handle kernel read from unreadable memory at virtual address 0000000000000018
[ 0.000000] Mem abort info:
[ 0.000000] ESR = 0x0000000096000004
[ 0.000000] EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.000000] SET = 0, FnV = 0
[ 0.000000] EA = 0, S1PTW = 0
[ 0.000000] FSC = 0x04: level 0 translation fault
[ 0.000000] Data abort info:
[ 0.000000] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 0.000000] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 0.000000] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 0.000000] [0000000000000018] user address but active_mm is swapper
[ 0.000000] Internal error: Oops: 0000000096000004 [#1] SMP
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.8.0-rc3+ #30
[ 0.000000] pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.000000] pc : kmalloc_trace+0x138/0x340
[ 0.000000] lr : register_sys_off_handler+0x60/0x258
[ 0.000000] sp : ffff8000827d3d10
[ 0.000000] x29: ffff8000827d3d20 x28: 000000005cd7e0ac x27: 0000000000001f3f
[ 0.000000] x26: 0000000000000000 x25: ffff8000802bd890 x24: ffff8000802bd890
[ 0.000000] x23: 0000000000000040 x22: 0000000000000dc0 x21: 0000000000000001
[ 0.000000] x20: 0000000000000000 x19: 0000000000000000 x18: 0000000000000006
[ 0.000000] x17: 000000000036fd40 x16: 000000005ec902c0 x15: ffff8000827d37c0
[ 0.000000] x14: 0000000000000000 x13: 312e3176206e6f69 x12: 746e65766e6f4320
[ 0.000000] x11: 00000000ffffdfff x10: ffff8000828cebe0 x9 : ffff80008281ea10
[ 0.000000] x8 : ffff8000827d3d78 x7 : 0000000000000000 x6 : 0000000000000000
[ 0.000000] x5 : 0000000000000000 x4 : ffff8000827e0000 x3 : ffff8000827f41c0
[ 0.000000] x2 : 0000000000000040 x1 : 0000000000000dc0 x0 : 0000000000000000
[ 0.000000] Call trace:
[ 0.000000] kmalloc_trace+0x138/0x340
[ 0.000000] register_sys_off_handler+0x60/0x258
[ 0.000000] psci_probe+0x2cc/0x350
[ 0.000000] psci_acpi_init+0x50/0x88
[ 0.000000] setup_arch+0x194/0x278
[ 0.000000] start_kernel+0x7c/0x410
[ 0.000000] __primary_switched+0xb8/0xc8
[ 0.000000] Code: b5000f7a f94003f4 aa1803fe d50320ff (b9401a64)
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
On Tue, Mar 12, 2024 at 04:36:05PM +0000, David Woodhouse wrote:
> On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> > Looked briefly at register_sys_off_handler and it should be OK to call
> > it from psci_init_system_off2() below. Any particular reason for having
> > separate initcall to do this ? We can even eliminate the need for
> > psci_init_system_off2 if it can be called from there. What am I missing ?
>
> My first attempt did that. I don't think we can kmalloc that early:
>

That was my initial guess. But a quick hack on my setup and running it on
the FVP model didn't complain. I think either I messed up or something else
is wrong; I must check on some h/w. Anyway, sorry for the noise and thanks
for the response.

--
Regards,
Sudeep
On Wed, Mar 13, 2024 at 03:34:44PM +0000, Sudeep Holla wrote:
> On Tue, Mar 12, 2024 at 04:36:05PM +0000, David Woodhouse wrote:
> > On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
> > > Looked briefly at register_sys_off_handler and it should be OK to call
> > > it from psci_init_system_off2() below. Any particular reason for having
> > > separate initcall to do this ? We can even eliminate the need for
> > > psci_init_system_off2 if it can be called from there. What am I missing ?
> >
> > My first attempt did that. I don't think we can kmalloc that early:
> >
>
> That was my initial guess. But a quick hack on my setup and running it on
> the FVP model didn't complain. I think either I messed up or something else
> is wrong; I must check on some h/w. Anyway, sorry for the noise and thanks
> for the response.
>

OK, it was indeed giving -ENOMEM, which in my hack didn't get propagated
properly 🙁. I assume you have some config that results in the crash
instead of the -ENOMEM I see in my setup (FVP as well as hardware).

Sorry for the noise.

--
Regards,
Sudeep
On 14 March 2024 12:09:11 CET, Sudeep Holla <sudeep.holla@arm.com> wrote:
>On Wed, Mar 13, 2024 at 03:34:44PM +0000, Sudeep Holla wrote:
>> On Tue, Mar 12, 2024 at 04:36:05PM +0000, David Woodhouse wrote:
>> > On Tue, 2024-03-12 at 15:57 +0000, Sudeep Holla wrote:
>> > > Looked briefly at register_sys_off_handler and it should be OK to call
>> > > it from psci_init_system_off2() below. Any particular reason for having
>> > > separate initcall to do this ? We can even eliminate the need for
>> > > psci_init_system_off2 if it can be called from there. What am I missing ?
>> >
>> > My first attempt did that. I don't think we can kmalloc that early:
>> >
>>
>> That was my initial guess. But a quick hack on my setup and running it on
>> the FVP model didn't complain. I think either I messed up or something else
>> is wrong; I must check on some h/w. Anyway, sorry for the noise and thanks
>> for the response.
>>
>
>OK, it was indeed giving -ENOMEM, which in my hack didn't get propagated
>properly 🙁. I assume you have some config that results in the crash
>instead of the -ENOMEM I see in my setup (FVP as well as hardware).
>
>Sorry for the noise.

Fairly stock Fedora config, with a few tweaks.
http://david.woodhou.se/arm-hibernate-config

I note kmalloc_trace() is in the backtrace.
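
The outcome of this exchange, that the probe step can only record support and the allocating registration has to wait for a later initcall, can be modelled in userspace. This is a rough sketch under stated assumptions: all model_* names are hypothetical, malloc() stands in for kmalloc(), and returning -ENOMEM mirrors only the behaviour Sudeep reports, not the oops in David's log.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace model only; every model_* name is hypothetical. */
static bool model_allocator_ready;   /* false during the early probe stage */
static bool model_off2_supported;    /* mirrors psci_system_off2_supported */
static void *model_registered;       /* non-NULL once a handler is registered */

/* Allocation fails until the allocator has been brought up. */
static void *model_alloc(size_t size)
{
	return model_allocator_ready ? malloc(size) : NULL;
}

/* Stands in for a registration call that must allocate its handler. */
static int model_register_handler(void)
{
	void *handler = model_alloc(64);

	if (!handler)
		return -ENOMEM;
	model_registered = handler;
	return 0;
}

/* Early probe: record what the firmware supports, allocate nothing. */
static void model_early_probe(bool firmware_has_off2)
{
	model_off2_supported = firmware_has_off2;
}

/* Later initcall: the allocator is up, so registration can succeed. */
static int model_late_initcall(void)
{
	if (!model_off2_supported)
		return 0;
	return model_register_handler();
}
```

In this model, registering straight from the probe stage fails with -ENOMEM, while the same call from the later stage succeeds, which is the split the patch makes between psci_init_system_off2() and psci_hibernate_init().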