From: Mirela Simonovic <mirela.simonovic@aggios.com>
Trigger Xen suspend when the hardware domain initiates suspend via
SHUTDOWN_suspend. Redirect system suspend to CPU#0 to ensure the
suspend logic runs on the boot CPU, as required.
Introduce full suspend/resume infrastructure gated by CONFIG_SYSTEM_SUSPEND,
including logic to:
- disable and enable non-boot physical CPUs
- freeze and thaw domains
- suspend and resume the GIC, timer and console
- maintain system state before and after suspend
Extend arch_domain_shutdown to return a status code and integrate
system_suspend into the ARM-specific path. Update stub implementations
and function signatures across other architectures (x86, riscv, ppc)
for consistency.
Remove the restriction in the PSCI interface preventing suspend from the
hardware domain.
Update Kconfig and Makefile to support conditional inclusion of the new
suspend.c implementation.Limit SYSTEM_SUSPEND support to ARM64 with
UNSUPPORTED enabled and LLC_COLORING disabled.
Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Signed-off-by: Saeed Nowshadi <saeed.nowshadi@xilinx.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes introduced in V4:
- drop code for saving and restoring VCPU context (for more information
refer part 1 of patch series)
- remove IOMMU suspend and resume calls until they will be implemented
- move system suspend logic to arch_domain_shutdown, invoked from
domain_shutdown
- apply minor fixes such as renaming functions, updating comments, and
modifying the commit message to reflect these changes
- add console_end_sync to resume path after system suspend
Changes introduced in V3:
- merge changes from other commits into this patch (previously stashed):
1) disable/enable non-boot physical CPUs and freeze/thaw domains during
suspend/resume
2) suspend/resume the timer, GIC, console, IOMMU, and hardware domain
---
xen/arch/arm/Kconfig | 1 +
xen/arch/arm/Makefile | 1 +
xen/arch/arm/domain.c | 21 ++++-
xen/arch/arm/include/asm/suspend.h | 22 +++++
xen/arch/arm/suspend.c | 143 +++++++++++++++++++++++++++++
xen/arch/arm/vpsci.c | 4 -
xen/arch/ppc/stubs.c | 2 +-
xen/arch/riscv/stubs.c | 2 +-
xen/arch/x86/domain.c | 4 +-
xen/common/domain.c | 9 +-
xen/include/xen/domain.h | 2 +-
11 files changed, 200 insertions(+), 11 deletions(-)
create mode 100644 xen/arch/arm/include/asm/suspend.h
create mode 100644 xen/arch/arm/suspend.c
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 57919d8b3a..80d0a6bf54 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -19,6 +19,7 @@ config ARM
select HAS_DOM0LESS
select HAS_GRANT_CACHE_FLUSH if GRANT_TABLE
select HAS_STACK_PROTECTOR
+ select HAS_SYSTEM_SUSPEND if ARM_64 && !LLC_COLORING && UNSUPPORTED
select HAS_UBSAN
config ARCH_DEFCONFIG
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index eeeac4e653..381969dd90 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -53,6 +53,7 @@ obj-y += smp.o
obj-y += smpboot.o
obj-$(CONFIG_STATIC_MEMORY) += static-memory.init.o
obj-$(CONFIG_STATIC_SHM) += static-shmem.init.o
+obj-$(CONFIG_SYSTEM_SUSPEND) += suspend.o
obj-y += sysctl.o
obj-y += time.o
obj-y += traps.o
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 45aeb8bddc..75092a997f 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -24,11 +24,14 @@
#include <asm/platform.h>
#include <asm/procinfo.h>
#include <asm/regs.h>
+#include <asm/suspend.h>
#include <asm/tee/tee.h>
#include <asm/vfp.h>
#include <asm/vgic.h>
#include <asm/vtimer.h>
+#include <public/sched.h>
+
#include "vpci.h"
#include "vuart.h"
@@ -857,8 +860,24 @@ void arch_domain_destroy(struct domain *d)
domain_io_free(d);
}
-void arch_domain_shutdown(struct domain *d)
+int arch_domain_shutdown(struct domain *d)
{
+ switch ( d->shutdown_code )
+ {
+ case SHUTDOWN_suspend:
+#ifdef CONFIG_SYSTEM_SUSPEND
+ if ( !is_hardware_domain(d) )
+ break;
+
+ return host_system_suspend();
+#else
+ break;
+#endif
+ default:
+ break;
+ }
+
+ return 0;
}
void arch_domain_pause(struct domain *d)
diff --git a/xen/arch/arm/include/asm/suspend.h b/xen/arch/arm/include/asm/suspend.h
new file mode 100644
index 0000000000..78d0e2383b
--- /dev/null
+++ b/xen/arch/arm/include/asm/suspend.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_ARM_SUSPEND_H__
+#define __ASM_ARM_SUSPEND_H__
+
+#ifdef CONFIG_SYSTEM_SUSPEND
+
+int host_system_suspend(void);
+
+#endif /* CONFIG_SYSTEM_SUSPEND */
+
+#endif /* __ASM_ARM_SUSPEND_H__ */
+
+ /*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/suspend.c b/xen/arch/arm/suspend.c
new file mode 100644
index 0000000000..13e59015c0
--- /dev/null
+++ b/xen/arch/arm/suspend.c
@@ -0,0 +1,143 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <xen/console.h>
+#include <xen/cpu.h>
+#include <xen/sched.h>
+
+/*
+ * TODO list:
+ * - Test system suspend with LLC_COLORING enabled and verify functionality
+ * - Implement IOMMU suspend/resume handlers and integrate them
+ * into the suspend/resume path
+ * - Add GICv3 suspend/resume support if required
+ * - Enable "xl suspend" support on ARM architecture
+ * - Properly disable Xen timer watchdog from relevant services
+ * - Add suspend/resume CI test for ARM (QEMU if feasible)
+ * - Investigate feasibility and need for implementing system suspend on ARM32
+ */
+
+/* Xen suspend. Note: data is not used (suspend is the suspend to RAM) */
+static long system_suspend(void *data)
+{
+ int status;
+ unsigned long flags;
+
+ BUG_ON(system_state != SYS_STATE_active);
+
+ system_state = SYS_STATE_suspend;
+ freeze_domains();
+ scheduler_disable();
+
+ /*
+ * Non-boot CPUs have to be disabled on suspend and enabled on resume
+ * (hotplug-based mechanism). Disabling non-boot CPUs will lead to PSCI
+ * CPU_OFF to be called by each non-boot CPU. Depending on the underlying
+ * platform capabilities, this may lead to the physical powering down of
+ * CPUs. Tested on Xilinx Zynq Ultrascale+ MPSoC (including power down of
+ * each non-boot CPU).
+ */
+ status = disable_nonboot_cpus();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_nonboot_cpus;
+ }
+
+ time_suspend();
+
+ local_irq_save(flags);
+ status = gic_suspend();
+ if ( status )
+ {
+ system_state = SYS_STATE_resume;
+ goto resume_irqs;
+ }
+
+ printk("Xen suspending...\n");
+
+ console_start_sync();
+ status = console_suspend();
+ if ( status )
+ {
+ dprintk(XENLOG_ERR, "Failed to suspend the console, err=%d\n", status);
+ system_state = SYS_STATE_resume;
+ goto resume_console;
+ }
+
+ /*
+ * Enable identity mapping before entering suspend to simplify
+ * the resume path
+ */
+ update_boot_mapping(true);
+
+ system_state = SYS_STATE_resume;
+ update_boot_mapping(false);
+
+ resume_console:
+ console_resume();
+ console_end_sync();
+
+ gic_resume();
+
+ resume_irqs:
+ local_irq_restore(flags);
+ time_resume();
+
+ resume_nonboot_cpus:
+ /*
+ * The rcu_barrier() has to be added to ensure that the per cpu area is
+ * freed before a non-boot CPU tries to initialize it (_free_percpu_area()
+ * has to be called before the init_percpu_area()). This scenario occurs
+ * when non-boot CPUs are hot-unplugged on suspend and hotplugged on resume.
+ */
+ rcu_barrier();
+ enable_nonboot_cpus();
+ scheduler_enable();
+ thaw_domains();
+
+ system_state = SYS_STATE_active;
+
+ /* The resume of hardware domain should always follow Xen's resume. */
+ domain_resume(hardware_domain);
+
+ printk("Resume (status %d)\n", status);
+ return status;
+}
+
+int host_system_suspend(void)
+{
+ int status;
+
+ /*
+ * system_suspend should be called when Dom0 finalizes the suspend
+ * procedure from its boot core (VCPU#0). However, Dom0's VCPU#0 could
+ * be mapped to any PCPU (this function could be executed by any PCPU).
+ * The suspend procedure has to be finalized by the PCPU#0 (non-boot
+ * PCPUs will be disabled during the suspend).
+ */
+ status = continue_hypercall_on_cpu(0, system_suspend, NULL);
+
+ /*
+ * If an error happened, there is nothing that needs to be done here
+ * because the system_suspend always returns in fully functional state
+ * no matter what the outcome of suspend procedure is. If the system
+ * suspended successfully the function will return 0 after the resume.
+ * Otherwise, if an error is returned it means Xen did not suspended,
+ * but it is still in the same state as if the system_suspend was never
+ * called. We dump a debug message in case of an error for debugging/
+ * logging purpose.
+ */
+ if ( status )
+ dprintk(XENLOG_ERR, "Failed to suspend, errno=%d\n", status);
+
+ return status;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index 87a63b0a94..27e413b1fb 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -202,10 +202,6 @@ static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
struct vcpu *v;
struct domain *d = current->domain;
- /* Drop this check once SYSTEM_SUSPEND is supported in hardware domain */
- if ( is_hardware_domain(d) )
- return PSCI_NOT_SUPPORTED;
-
/* Ensure that all CPUs other than the calling one are offline */
for_each_vcpu ( d, v )
{
diff --git a/xen/arch/ppc/stubs.c b/xen/arch/ppc/stubs.c
index 671e71aa0a..5bb08dcd95 100644
--- a/xen/arch/ppc/stubs.c
+++ b/xen/arch/ppc/stubs.c
@@ -219,7 +219,7 @@ void arch_domain_destroy(struct domain *d)
BUG_ON("unimplemented");
}
-void arch_domain_shutdown(struct domain *d)
+int arch_domain_shutdown(struct domain *d)
{
BUG_ON("unimplemented");
}
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index 83416d3350..e2cb6c2579 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -204,7 +204,7 @@ void arch_domain_destroy(struct domain *d)
BUG_ON("unimplemented");
}
-void arch_domain_shutdown(struct domain *d)
+int arch_domain_shutdown(struct domain *d)
{
BUG_ON("unimplemented");
}
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7536b6c871..ab4b54fc3c 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -951,10 +951,12 @@ void arch_domain_destroy(struct domain *d)
psr_domain_free(d);
}
-void arch_domain_shutdown(struct domain *d)
+int arch_domain_shutdown(struct domain *d)
{
if ( is_viridian_domain(d) )
viridian_time_domain_freeze(d);
+
+ return 0;
}
void arch_domain_pause(struct domain *d)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 58c5ffc466..6a3140561e 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1276,6 +1276,7 @@ void __domain_crash(struct domain *d)
int domain_shutdown(struct domain *d, u8 reason)
{
struct vcpu *v;
+ int ret;
#ifdef CONFIG_X86
if ( pv_shim )
@@ -1288,7 +1289,11 @@ int domain_shutdown(struct domain *d, u8 reason)
d->shutdown_code = reason;
reason = d->shutdown_code;
+#if defined(CONFIG_SYSTEM_SUSPEND) && defined(CONFIG_ARM)
+ if ( reason != SHUTDOWN_suspend && is_hardware_domain(d) )
+#else
if ( is_hardware_domain(d) )
+#endif
hwdom_shutdown(reason);
if ( d->is_shutting_down )
@@ -1311,13 +1316,13 @@ int domain_shutdown(struct domain *d, u8 reason)
v->paused_for_shutdown = 1;
}
- arch_domain_shutdown(d);
+ ret = arch_domain_shutdown(d);
__domain_finalise_shutdown(d);
spin_unlock(&d->shutdown_lock);
- return 0;
+ return ret;
}
void domain_resume(struct domain *d)
diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h
index e10baf2615..c0cbbd808f 100644
--- a/xen/include/xen/domain.h
+++ b/xen/include/xen/domain.h
@@ -101,7 +101,7 @@ int arch_domain_create(struct domain *d,
int arch_domain_teardown(struct domain *d);
void arch_domain_destroy(struct domain *d);
-void arch_domain_shutdown(struct domain *d);
+int arch_domain_shutdown(struct domain *d);
void arch_domain_pause(struct domain *d);
void arch_domain_unpause(struct domain *d);
--
2.48.1
On 02.06.2025 11:04, Mykola Kvach wrote:
> @@ -857,8 +860,24 @@ void arch_domain_destroy(struct domain *d)
> domain_io_free(d);
> }
>
> -void arch_domain_shutdown(struct domain *d)
> +int arch_domain_shutdown(struct domain *d)
> {
> + switch ( d->shutdown_code )
> + {
> + case SHUTDOWN_suspend:
> +#ifdef CONFIG_SYSTEM_SUSPEND
> + if ( !is_hardware_domain(d) )
> + break;
> +
> + return host_system_suspend();
> +#else
> + break;
> +#endif
> + default:
> + break;
> + }
> +
> + return 0;
> }
What's wrong with
int arch_domain_shutdown(struct domain *d)
{
switch ( d->shutdown_code )
{
#ifdef CONFIG_SYSTEM_SUSPEND
case SHUTDOWN_suspend:
if ( !is_hardware_domain(d) )
break;
return host_system_suspend();
#endif
default:
break;
}
return 0;
}
?
> @@ -1311,13 +1316,13 @@ int domain_shutdown(struct domain *d, u8 reason)
> v->paused_for_shutdown = 1;
> }
>
> - arch_domain_shutdown(d);
> + ret = arch_domain_shutdown(d);
If non-zero comes back here, is ...
> __domain_finalise_shutdown(d);
... this still appropriate to call?
> spin_unlock(&d->shutdown_lock);
>
> - return 0;
> + return ret;
> }
Most callers don't care about the return value of this function. That likely
needs to change, even if _for now_ you only alter the SHUTDOWN_suspend case
(and hence some of the callers aren't immediately impacted)?
Jan
Hi, @Jan Beulich
On Mon, Jun 2, 2025 at 12:26 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 02.06.2025 11:04, Mykola Kvach wrote:
> > @@ -857,8 +860,24 @@ void arch_domain_destroy(struct domain *d)
> > domain_io_free(d);
> > }
> >
> > -void arch_domain_shutdown(struct domain *d)
> > +int arch_domain_shutdown(struct domain *d)
> > {
> > + switch ( d->shutdown_code )
> > + {
> > + case SHUTDOWN_suspend:
> > +#ifdef CONFIG_SYSTEM_SUSPEND
> > + if ( !is_hardware_domain(d) )
> > + break;
> > +
> > + return host_system_suspend();
> > +#else
> > + break;
> > +#endif
> > + default:
> > + break;
> > + }
> > +
> > + return 0;
> > }
>
> What's wrong with
>
> int arch_domain_shutdown(struct domain *d)
> {
> switch ( d->shutdown_code )
> {
> #ifdef CONFIG_SYSTEM_SUSPEND
> case SHUTDOWN_suspend:
> if ( !is_hardware_domain(d) )
> break;
>
> return host_system_suspend();
> #endif
>
> default:
> break;
> }
>
> return 0;
> }
>
> ?
You're right — your version is cleaner and avoids unnecessary
preprocessor spaghetti.
My original intention was to keep the SHUTDOWN_suspend case visible
even when CONFIG_SYSTEM_SUSPEND is disabled, so future changes
wouldn't require reintroducing the case label. But thinking about it
again, if future support is needed, the code can be easily adjusted at
that time.
I'll switch to your suggested version — it's simpler and more readable.
>
> > @@ -1311,13 +1316,13 @@ int domain_shutdown(struct domain *d, u8 reason)
> > v->paused_for_shutdown = 1;
> > }
> >
> > - arch_domain_shutdown(d);
> > + ret = arch_domain_shutdown(d);
>
> If non-zero comes back here, is ...
>
> > __domain_finalise_shutdown(d);
>
> ... this still appropriate to call?
Thank you for catching that — you’re absolutely right, that was my oversight.
If arch_domain_shutdown returns a non-zero value, we need to undo
any previous actions performed before the call to properly restore the
domain’s state. Calling __domain_finalise_shutdown unconditionally
in that case could lead to an inconsistent state.
I will update the code accordingly to ensure proper cleanup and state
restoration.
>
> > spin_unlock(&d->shutdown_lock);
> >
> > - return 0;
> > + return ret;
> > }
>
> Most callers don't care about the return value of this function. That likely
> needs to change, even if _for now_ you only alter the SHUTDOWN_suspend case
> (and hence some of the callers aren't immediately impacted)?
Thanks for pointing this out. You’re right that most callers currently
ignore the return value of domain_shutdown(). This will need to be
updated to properly handle non-zero returns, at least for the
SHUTDOWN_suspend case where it matters.
Even if some callers aren’t immediately impacted, I agree it’s better
to fix this proactively to avoid silent failures or inconsistent states.
I’ll start reviewing the callers and update them accordingly.
>
> Jan
Thank you for reviewing this patch series.
Best regards,
Mykola
© 2016 - 2025 Red Hat, Inc.