[PATCH v7 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests

Mykola Kvach posted 4 patches 3 months ago
There is a newer version of this series
[PATCH v7 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
Posted by Mykola Kvach 3 months ago
From: Mykola Kvach <mykola_kvach@epam.com>

This patch adds support for the PSCI SYSTEM_SUSPEND function in the vPSCI
(virtual PSCI) interface, allowing guests to request suspend via the PSCI
v1.0 SYSTEM_SUSPEND call (both 32-bit and 64-bit variants).

The implementation:
- Adds SYSTEM_SUSPEND function IDs to PSCI definitions
- Implements trapping and handling of SYSTEM_SUSPEND in vPSCI
- Allows only non-hardware domains to invoke SYSTEM_SUSPEND; for the
  hardware domain, PSCI_NOT_SUPPORTED is returned to avoid halting the
  system in hwdom_shutdown() called from domain_shutdown
- Ensures all secondary VCPUs of the calling domain are offline before
  allowing suspend due to PSCI spec

Usage:

For Linux-based guests, suspend can be initiated with:
    echo mem > /sys/power/state
or via:
    systemctl suspend

Resuming the guest is performed from control domain using:
      xl resume <domain>

Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V7:
- add proper locking
- minor changes after code review

Changes in V6:
- skip execution of ctxt_switch_from for vcpu that is in paused domain
- add implementation of domain_resume without domain_pause
- add helper function to determine if vcpu is suspended or not
- ignore upper 32 bits of argument values when the domain is 64-bit
  and calls the SMC32 SYSTEM_SUSPEND function
- cosmetic changes after review

Changes in V5:
- don't use standby mode, restore execution in a provided by guest point
- move checking that all CPUs, except current one, are offline to after
  pausing the vCPUs
- provide ret status from arch_domain_shutdown and handle it in
  domain_shutdown
- adjust VPSCI_NR_FUNCS to reflect the number of newly added PSCI functions

Changes in V4:
Dropped all changes related to watchdog, domain is marked as shutting
down in domain_shutdown and watchdog timeout handler won't trigger
because of it.

Previous versions included code to manage Xen watchdog timers during suspend,
but this was removed. When a guest OS starts the Xen watchdog (either via the
kernel driver or xenwatchdogd), it is responsible for managing that state
across suspend/resume. On Linux, the Xen kernel driver properly stops the
watchdog during suspend. However, when xenwatchdogd is used instead, suspend
handling is incomplete, potentially leading to watchdog-triggered resets on
resume. Xen leaves watchdog handling to the guest OS and its services.

Dropped all changes related to VCPU context, because instead domain_shutdown
is used, so we don't need any extra changes for suspending domain.

Changes in V3:
Dropped all domain flags and related code (which touched common functions like
vcpu_unblock), keeping only the necessary changes for Xen suspend/resume, i.e.
suspend/resume is now fully supported only for the hardware domain.
Proper support for domU suspend/resume will be added in a future patch.
This patch does not yet include VCPU context reset or domain context
restoration in VCPU.
---
 xen/arch/arm/domain.c                 |   3 +
 xen/arch/arm/include/asm/perfc_defn.h |   1 +
 xen/arch/arm/include/asm/psci.h       |   2 +
 xen/arch/arm/include/asm/vpsci.h      |   2 +-
 xen/arch/arm/vpsci.c                  | 104 ++++++++++++++++++++++----
 xen/common/domain.c                   |  25 +++++--
 xen/include/xen/sched.h               |   7 ++
 7 files changed, 122 insertions(+), 22 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 310c578909..451b468755 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -90,6 +90,9 @@ static void ctxt_switch_from(struct vcpu *p)
     if ( is_idle_vcpu(p) )
         return;
 
+    if ( test_bit(_VPF_suspended, &p->pause_flags) )
+        return;
+
     p2m_save_state(p);
 
     /* CP 15 */
diff --git a/xen/arch/arm/include/asm/perfc_defn.h b/xen/arch/arm/include/asm/perfc_defn.h
index effd25b69e..8dfcac7e3b 100644
--- a/xen/arch/arm/include/asm/perfc_defn.h
+++ b/xen/arch/arm/include/asm/perfc_defn.h
@@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset,        "vpsci: system_reset")
 PERFCOUNTER(vpsci_cpu_suspend,         "vpsci: cpu_suspend")
 PERFCOUNTER(vpsci_cpu_affinity_info,   "vpsci: cpu_affinity_info")
 PERFCOUNTER(vpsci_features,            "vpsci: features")
+PERFCOUNTER(vpsci_system_suspend,      "vpsci: system_suspend")
 
 PERFCOUNTER(vcpu_kick,                 "vcpu: notify other vcpu")
 
diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h
index 4780972621..48a93e6b79 100644
--- a/xen/arch/arm/include/asm/psci.h
+++ b/xen/arch/arm/include/asm/psci.h
@@ -47,10 +47,12 @@ void call_psci_system_reset(void);
 #define PSCI_0_2_FN32_SYSTEM_OFF          PSCI_0_2_FN32(8)
 #define PSCI_0_2_FN32_SYSTEM_RESET        PSCI_0_2_FN32(9)
 #define PSCI_1_0_FN32_PSCI_FEATURES       PSCI_0_2_FN32(10)
+#define PSCI_1_0_FN32_SYSTEM_SUSPEND      PSCI_0_2_FN32(14)
 
 #define PSCI_0_2_FN64_CPU_SUSPEND         PSCI_0_2_FN64(1)
 #define PSCI_0_2_FN64_CPU_ON              PSCI_0_2_FN64(3)
 #define PSCI_0_2_FN64_AFFINITY_INFO       PSCI_0_2_FN64(4)
+#define PSCI_1_0_FN64_SYSTEM_SUSPEND      PSCI_0_2_FN64(14)
 
 /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */
 #define PSCI_0_2_AFFINITY_LEVEL_ON      0
diff --git a/xen/arch/arm/include/asm/vpsci.h b/xen/arch/arm/include/asm/vpsci.h
index 0cca5e6830..69d40f9d7f 100644
--- a/xen/arch/arm/include/asm/vpsci.h
+++ b/xen/arch/arm/include/asm/vpsci.h
@@ -23,7 +23,7 @@
 #include <asm/psci.h>
 
 /* Number of function implemented by virtual PSCI (only 0.2 or later) */
-#define VPSCI_NR_FUNCS  12
+#define VPSCI_NR_FUNCS  14
 
 /* Functions handle PSCI calls from the guests */
 bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid);
diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index 7ba9ccd94b..c8f74fd28b 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -10,28 +10,18 @@
 
 #include <public/sched.h>
 
-static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
-                            register_t context_id)
+static int do_setup_vcpu_ctx(struct vcpu *v, register_t entry_point,
+                      register_t context_id)
 {
-    struct vcpu *v;
     struct domain *d = current->domain;
     struct vcpu_guest_context *ctxt;
     int rc;
     bool is_thumb = entry_point & 1;
-    register_t vcpuid;
-
-    vcpuid = vaffinity_to_vcpuid(target_cpu);
-
-    if ( (v = domain_vcpu(d, vcpuid)) == NULL )
-        return PSCI_INVALID_PARAMETERS;
 
     /* THUMB set is not allowed with 64-bit domain */
     if ( is_64bit_domain(d) && is_thumb )
         return PSCI_INVALID_ADDRESS;
 
-    if ( !test_bit(_VPF_down, &v->pause_flags) )
-        return PSCI_ALREADY_ON;
-
     if ( (ctxt = alloc_vcpu_guest_context()) == NULL )
         return PSCI_DENIED;
 
@@ -78,11 +68,32 @@ static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
     if ( rc < 0 )
         return PSCI_DENIED;
 
-    vcpu_wake(v);
-
     return PSCI_SUCCESS;
 }
 
+static int do_common_cpu_on(register_t target_cpu, register_t entry_point,
+                            register_t context_id)
+{
+    int rc;
+    struct vcpu *v;
+    struct domain *d = current->domain;
+    register_t vcpuid;
+
+    vcpuid = vaffinity_to_vcpuid(target_cpu);
+
+    if ( (v = domain_vcpu(d, vcpuid)) == NULL )
+        return PSCI_INVALID_PARAMETERS;
+
+    if ( !test_bit(_VPF_down, &v->pause_flags) )
+        return PSCI_ALREADY_ON;
+
+    rc = do_setup_vcpu_ctx(v, entry_point, context_id);
+    if ( rc == PSCI_SUCCESS )
+        vcpu_wake(v);
+
+    return rc;
+}
+
 static int32_t do_psci_cpu_on(uint32_t vcpuid, register_t entry_point)
 {
     int32_t ret;
@@ -197,6 +208,51 @@ static void do_psci_0_2_system_reset(void)
     domain_shutdown(d,SHUTDOWN_reboot);
 }
 
+static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid)
+{
+    int32_t rc;
+    struct vcpu *v;
+    struct domain *d = current->domain;
+
+    /* SYSTEM_SUSPEND is not supported for the hardware domain yet */
+    if ( is_hardware_domain(d) )
+        return PSCI_NOT_SUPPORTED;
+
+    domain_lock(d);
+
+    /* Ensure that all CPUs other than the calling one are offline */
+    for_each_vcpu ( d, v )
+    {
+        if ( v != current && is_vcpu_online(v) )
+        {
+            rc = PSCI_DENIED;
+            goto exit_unlock;
+        }
+    }
+
+    rc = do_setup_vcpu_ctx(current, epoint, cid);
+    if ( rc != PSCI_SUCCESS )
+        goto exit_unlock;
+
+    rc = domain_shutdown(d, SHUTDOWN_suspend) ?
+         PSCI_INTERNAL_FAILURE : PSCI_SUCCESS;
+    if ( rc != PSCI_SUCCESS )
+    {
+        domain_resume_nopause(d);
+        goto exit_unlock;
+    }
+
+    set_bit(_VPF_suspended, &current->pause_flags);
+
+    dprintk(XENLOG_INFO,
+            "Dom %u: SYSTEM_SUSPEND requested, epoint=%#lx, cid=%#lx\n",
+            d->domain_id, epoint, cid);
+
+ exit_unlock:
+    domain_unlock(d);
+    return rc;
+}
+
 static int32_t do_psci_1_0_features(uint32_t psci_func_id)
 {
     /* /!\ Ordered by function ID and not name */
@@ -214,6 +270,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id)
     case PSCI_0_2_FN32_SYSTEM_OFF:
     case PSCI_0_2_FN32_SYSTEM_RESET:
     case PSCI_1_0_FN32_PSCI_FEATURES:
+    case PSCI_1_0_FN32_SYSTEM_SUSPEND:
+    case PSCI_1_0_FN64_SYSTEM_SUSPEND:
     case ARM_SMCCC_VERSION_FID:
         return 0;
     default:
@@ -344,6 +402,24 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid)
         return true;
     }
 
+    case PSCI_1_0_FN32_SYSTEM_SUSPEND:
+    case PSCI_1_0_FN64_SYSTEM_SUSPEND:
+    {
+        register_t epoint = PSCI_ARG(regs, 1);
+        register_t cid = PSCI_ARG(regs, 2);
+
+        if ( is_64bit_domain(current->domain) &&
+             fid == PSCI_1_0_FN32_SYSTEM_SUSPEND )
+        {
+            epoint &= GENMASK(31, 0);
+            cid &= GENMASK(31, 0);
+        }
+
+        perfc_incr(vpsci_system_suspend);
+        PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid));
+        return true;
+    }
+
     default:
         return false;
     }
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3c65cca5b0..bd698dfc0f 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1334,16 +1334,15 @@ int domain_shutdown(struct domain *d, u8 reason)
     return 0;
 }
 
-void domain_resume(struct domain *d)
+#ifdef CONFIG_ARM
+void domain_resume_nopause(struct domain *d)
+#else
+static void domain_resume_nopause(struct domain *d)
+#endif
 {
     struct vcpu *v;
 
-    /*
-     * Some code paths assume that shutdown status does not get reset under
-     * their feet (e.g., some assertions make this assumption).
-     */
-    domain_pause(d);
-
+    domain_lock(d);
     spin_lock(&d->shutdown_lock);
 
     d->is_shutting_down = d->is_shut_down = 0;
@@ -1351,13 +1350,25 @@ void domain_resume(struct domain *d)
 
     for_each_vcpu ( d, v )
     {
+        clear_bit(_VPF_suspended, &v->pause_flags);
+
         if ( v->paused_for_shutdown )
             vcpu_unpause(v);
         v->paused_for_shutdown = 0;
     }
 
     spin_unlock(&d->shutdown_lock);
+    domain_unlock(d);
+}
 
+void domain_resume(struct domain *d)
+{
+    /*
+     * Some code paths assume that shutdown status does not get reset under
+     * their feet (e.g., some assertions make this assumption).
+     */
+    domain_pause(d);
+    domain_resume_nopause(d);
     domain_unpause(d);
 }
 
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index fe53d4fab7..5647a15c84 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -814,6 +814,9 @@ void domain_destroy(struct domain *d);
 int domain_kill(struct domain *d);
 int domain_shutdown(struct domain *d, u8 reason);
 void domain_resume(struct domain *d);
+#ifdef CONFIG_ARM
+void domain_resume_nopause(struct domain *d);
+#endif
 
 int domain_soft_reset(struct domain *d, bool resuming);
 
@@ -1010,6 +1013,10 @@ static inline struct domain *next_domain_in_cpupool(
 /* VCPU is parked. */
 #define _VPF_parked          8
 #define VPF_parked           (1UL<<_VPF_parked)
+/* VCPU is suspended. */
+#define _VPF_suspended       9
+#define VPF_suspended        (1UL<<_VPF_suspended)
+
 
 static inline bool vcpu_runnable(const struct vcpu *v)
 {
-- 
2.48.1
Re: [PATCH v7 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
Posted by Jan Beulich 3 months ago
On 29.07.2025 10:52, Mykola Kvach wrote:
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -1334,16 +1334,15 @@ int domain_shutdown(struct domain *d, u8 reason)
>      return 0;
>  }
>  
> -void domain_resume(struct domain *d)
> +#ifdef CONFIG_ARM
> +void domain_resume_nopause(struct domain *d)
> +#else
> +static void domain_resume_nopause(struct domain *d)
> +#endif

#ifndef CONFIG_ARM
static
#endif
void domain_resume_nopause(struct domain *d)

to have as little redundancy as possible.

>  {
>      struct vcpu *v;
>  
> -    /*
> -     * Some code paths assume that shutdown status does not get reset under
> -     * their feet (e.g., some assertions make this assumption).
> -     */
> -    domain_pause(d);
> -
> +    domain_lock(d);

This addition of locking affects domain_resume() as well. Neither need nor
correctness are discussed in the description. If the locking was really
needed for domain_resume() as well, I suppose adding that would better be
a separate change anyway.

The addition of this locking is particularly interesting considering that
...

>      spin_lock(&d->shutdown_lock);

... is what follows right after.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -814,6 +814,9 @@ void domain_destroy(struct domain *d);
>  int domain_kill(struct domain *d);
>  int domain_shutdown(struct domain *d, u8 reason);
>  void domain_resume(struct domain *d);
> +#ifdef CONFIG_ARM
> +void domain_resume_nopause(struct domain *d);
> +#endif
>  
>  int domain_soft_reset(struct domain *d, bool resuming);
>  
> @@ -1010,6 +1013,10 @@ static inline struct domain *next_domain_in_cpupool(
>  /* VCPU is parked. */
>  #define _VPF_parked          8
>  #define VPF_parked           (1UL<<_VPF_parked)
> +/* VCPU is suspended. */
> +#define _VPF_suspended       9
> +#define VPF_suspended        (1UL<<_VPF_suspended)

Irrespective of the style violations in pre-existing code, can you please
not add further violations, by inserting the missing blanks?

> +
>  

Please also don't introduce double blank lines.

Jan
Re: [PATCH v7 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
Posted by Mykola Kvach 2 months, 3 weeks ago
Hi Jan,

On Tue, Jul 29, 2025 at 12:11 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 29.07.2025 10:52, Mykola Kvach wrote:
> > --- a/xen/common/domain.c
> > +++ b/xen/common/domain.c
> > @@ -1334,16 +1334,15 @@ int domain_shutdown(struct domain *d, u8 reason)
> >      return 0;
> >  }
> >
> > -void domain_resume(struct domain *d)
> > +#ifdef CONFIG_ARM
> > +void domain_resume_nopause(struct domain *d)
> > +#else
> > +static void domain_resume_nopause(struct domain *d)
> > +#endif
>
> #ifndef CONFIG_ARM
> static
> #endif
> void domain_resume_nopause(struct domain *d)
>
> to have as little redundancy as possible.

Okay, I’ll change it.

>
> >  {
> >      struct vcpu *v;
> >
> > -    /*
> > -     * Some code paths assume that shutdown status does not get reset under
> > -     * their feet (e.g., some assertions make this assumption).
> > -     */
> > -    domain_pause(d);
> > -
> > +    domain_lock(d);
>
> This addition of locking affects domain_resume() as well. Neither need nor
> correctness are discussed in the description. If the locking was really
> needed for domain_resume() as well, I suppose adding that would better be
> a separate change anyway.

Thanks for the review.

The locking was added to avoid potential races involving _VPF_suspended and
the suspend/resume logic.

Consider the case where domain_lock() is not used inside domain_resume():

Domain 1 initiates suspend via PSCI SYSTEM_SUSPEND. At the same time,
Domain 0 invokes resume for Domain 1.

Domain 0 calls xl resume, which leads to domain_resume(). Domain 1 acquires
domain_lock() as part of the suspend path. Then it acquires the shutdown
lock in domain_shutdown(). Domain 0 is blocked waiting for the shutdown
lock. When Domain 1 releases the shutdown lock, it sets _VPF_suspended and
modifies the VCPU context. Then Domain 0 clears _VPF_suspended.

At this point, ctxt_switch_from() might be called with _VPF_suspended
already cleared, and the VCPU context partially updated. For example, the
guest PC is set to the resume entry point, but some registers like TTBR or
SCTLR_EL1 are saved from the current hardware context by
ctxt_switch_from.

However, after reviewing the flow again, I think this kind of race could
still happen even with the lock in place. Imagine Domain 1 sets the flag
via SYSTEM_SUSPEND, and then Domain 0 clears it by resuming the domain
before the first context switch. This could still result in a partially
updated context with inconsistent state.

So it might be better to update the VCPU context at the point of resume
instead of doing it during suspend. I'll look into that further and also
check for other possible races if the update is moved.

>
> The addition of this locking is particularly interesting considering that
> ...
>
> >      spin_lock(&d->shutdown_lock);
>
> ... is what follows right after.
>
> > --- a/xen/include/xen/sched.h
> > +++ b/xen/include/xen/sched.h
> > @@ -814,6 +814,9 @@ void domain_destroy(struct domain *d);
> >  int domain_kill(struct domain *d);
> >  int domain_shutdown(struct domain *d, u8 reason);
> >  void domain_resume(struct domain *d);
> > +#ifdef CONFIG_ARM
> > +void domain_resume_nopause(struct domain *d);
> > +#endif
> >
> >  int domain_soft_reset(struct domain *d, bool resuming);
> >
> > @@ -1010,6 +1013,10 @@ static inline struct domain *next_domain_in_cpupool(
> >  /* VCPU is parked. */
> >  #define _VPF_parked          8
> >  #define VPF_parked           (1UL<<_VPF_parked)
> > +/* VCPU is suspended. */
> > +#define _VPF_suspended       9
> > +#define VPF_suspended        (1UL<<_VPF_suspended)
>
> Irrespective of the style violations in pre-existing code, can you please
> not add further violations, by inserting the missing blanks?

Okay

>
> > +
> >
>
> Please also don't introduce double blank lines.

I'll remove it.

>
> Jan

Best regards,
Mykola
Re: [PATCH v7 1/4] xen/arm: Implement PSCI SYSTEM_SUSPEND call for guests
Posted by Mykola Kvach 2 months, 3 weeks ago
On Thu, Aug 7, 2025 at 10:38 AM Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> Hi Jan,
>
> On Tue, Jul 29, 2025 at 12:11 PM Jan Beulich <jbeulich@suse.com> wrote:
> >
> > On 29.07.2025 10:52, Mykola Kvach wrote:
> > > --- a/xen/common/domain.c
> > > +++ b/xen/common/domain.c
> > > @@ -1334,16 +1334,15 @@ int domain_shutdown(struct domain *d, u8 reason)
> > >      return 0;
> > >  }
> > >
> > > -void domain_resume(struct domain *d)
> > > +#ifdef CONFIG_ARM
> > > +void domain_resume_nopause(struct domain *d)
> > > +#else
> > > +static void domain_resume_nopause(struct domain *d)
> > > +#endif
> >
> > #ifndef CONFIG_ARM
> > static
> > #endif
> > void domain_resume_nopause(struct domain *d)
> >
> > to have as little redundancy as possible.
>
> Okay, I’ll change it.
>
> >
> > >  {
> > >      struct vcpu *v;
> > >
> > > -    /*
> > > -     * Some code paths assume that shutdown status does not get reset under
> > > -     * their feet (e.g., some assertions make this assumption).
> > > -     */
> > > -    domain_pause(d);
> > > -
> > > +    domain_lock(d);
> >
> > This addition of locking affects domain_resume() as well. Neither need nor
> > correctness are discussed in the description. If the locking was really
> > needed for domain_resume() as well, I suppose adding that would better be
> > a separate change anyway.
>
> Thanks for the review.
>
> The locking was added to avoid potential races involving _VPF_suspended and
> the suspend/resume logic.
>
> Consider the case where domain_lock() is not used inside domain_resume():
>
> Domain 1 initiates suspend via PSCI SYSTEM_SUSPEND. At the same time,
> Domain 0 invokes resume for Domain 1.
>
> Domain 0 calls xl resume, which leads to domain_resume(). Domain 1 acquires
> domain_lock() as part of the suspend path. Then it acquires the shutdown
> lock in domain_shutdown(). Domain 0 is blocked waiting for the shutdown
> lock. When Domain 1 releases the shutdown lock, it sets _VPF_suspended and
> modifies the VCPU context. Then Domain 0 clears _VPF_suspended.
>
> At this point, ctxt_switch_from() might be called with _VPF_suspended
> already cleared, and the VCPU context partially updated. For example, the
> guest PC is set to the resume entry point, but some registers like TTBR or
> SCTLR_EL1 are saved from the current hardware context by
> ctxt_switch_from.
>
> However, after reviewing the flow again, I think this kind of race could
> still happen even with the lock in place. Imagine Domain 1 sets the flag
> via SYSTEM_SUSPEND, and then Domain 0 clears it by resuming the domain
> before the first context switch. This could still result in a partially
> updated context with inconsistent state.

There are no synchronization issues here -- domain_pause inside domain_resume
prevents them by design.

The only situation where issues might arise is during a SYSTEM_SUSPEND request
for a guest that has multiple vCPUs online (buggy OS), while another
vCPU performs
a CPU_ON request.

Therefore, it seems we only need to protect the loop that checks if other vCPUs
are offline during the SYSTEM_SUSPEND vPSCI call using the domain lock.

>
> So it might be better to update the VCPU context at the point of resume
> instead of doing it during suspend. I'll look into that further and also
> check for other possible races if the update is moved.
>
> >
> > The addition of this locking is particularly interesting considering that
> > ...
> >
> > >      spin_lock(&d->shutdown_lock);
> >
> > ... is what follows right after.
> >
> > > --- a/xen/include/xen/sched.h
> > > +++ b/xen/include/xen/sched.h
> > > @@ -814,6 +814,9 @@ void domain_destroy(struct domain *d);
> > >  int domain_kill(struct domain *d);
> > >  int domain_shutdown(struct domain *d, u8 reason);
> > >  void domain_resume(struct domain *d);
> > > +#ifdef CONFIG_ARM
> > > +void domain_resume_nopause(struct domain *d);
> > > +#endif
> > >
> > >  int domain_soft_reset(struct domain *d, bool resuming);
> > >
> > > @@ -1010,6 +1013,10 @@ static inline struct domain *next_domain_in_cpupool(
> > >  /* VCPU is parked. */
> > >  #define _VPF_parked          8
> > >  #define VPF_parked           (1UL<<_VPF_parked)
> > > +/* VCPU is suspended. */
> > > +#define _VPF_suspended       9
> > > +#define VPF_suspended        (1UL<<_VPF_suspended)
> >
> > Irrespective of the style violations in pre-existing code, can you please
> > not add further violations, by inserting the missing blanks?
>
> Okay
>
> >
> > > +
> > >
> >
> > Please also don't introduce double blank lines.
>
> I'll remove it.
>
> >
> > Jan
>
> Best regards,
> Mykola