[RFC PATCH] xen: handle domain_shutdown() return values

Mykola Kvach posted 1 patch 4 days, 5 hours ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/95dacecdce8f8417562548e16a4d3e11c41a3f27.1773923242.git.mykola._5Fkvach@epam.com
xen/arch/arm/vpsci.c    | 10 ++++++++--
xen/arch/x86/compat.c   |  3 +--
xen/arch/x86/hvm/dm.c   |  3 +--
xen/arch/x86/hvm/hvm.c  | 13 +++++++++++--
xen/common/domain.c     |  9 +++++++--
xen/common/sched/core.c |  9 +++++++--
6 files changed, 35 insertions(+), 12 deletions(-)
[RFC PATCH] xen: handle domain_shutdown() return values
Posted by Mykola Kvach 4 days, 5 hours ago
From: Mykola Kvach <mykola_kvach@epam.com>

Propagate domain_shutdown() return codes through the shutdown paths
which can still report errors to their callers, and log explicit
failures in fire-and-forget paths instead of silently discarding the
result.

This makes the shutdown contract explicit for callers which can report
errors, while preserving observable diagnostics for the remaining
fire-and-forget paths.

It also fixes MISRA Dir 4.7 and Rule 17.7 violations by ensuring that
the returned status is tested or otherwise used.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Link to discussion: https://patchew.org/Xen/cover.1748848482.git.mykola._5Fkvach@epam.com/7bd75ecfff5b0a75ea5abd7cc4934582d7e1250c.1748848482.git.mykola._5Fkvach@epam.com/#90048f71-8313-4110-924c-f956a2bec5a0@suse.com
---
 xen/arch/arm/vpsci.c    | 10 ++++++++--
 xen/arch/x86/compat.c   |  3 +--
 xen/arch/x86/hvm/dm.c   |  3 +--
 xen/arch/x86/hvm/hvm.c  | 13 +++++++++++--
 xen/common/domain.c     |  9 +++++++--
 xen/common/sched/core.c |  9 +++++++--
 6 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c
index 7ba9ccd94b..03b4cb0986 100644
--- a/xen/arch/arm/vpsci.c
+++ b/xen/arch/arm/vpsci.c
@@ -188,13 +188,19 @@ static int32_t do_psci_0_2_migrate_info_type(void)
 static void do_psci_0_2_system_off( void )
 {
     struct domain *d = current->domain;
-    domain_shutdown(d,SHUTDOWN_poweroff);
+    int rc = domain_shutdown(d, SHUTDOWN_poweroff);
+
+    if ( unlikely(rc) )
+        gprintk(XENLOG_ERR, "PSCI SYSTEM_OFF failed: rc=%d\n", rc);
 }
 
 static void do_psci_0_2_system_reset(void)
 {
     struct domain *d = current->domain;
-    domain_shutdown(d,SHUTDOWN_reboot);
+    int rc = domain_shutdown(d, SHUTDOWN_reboot);
+
+    if ( unlikely(rc) )
+        gprintk(XENLOG_ERR, "PSCI SYSTEM_RESET failed: rc=%d\n", rc);
 }
 
 static int32_t do_psci_1_0_features(uint32_t psci_func_id)
diff --git a/xen/arch/x86/compat.c b/xen/arch/x86/compat.c
index 217b5b1fcc..1c203a028f 100644
--- a/xen/arch/x86/compat.c
+++ b/xen/arch/x86/compat.c
@@ -39,8 +39,7 @@ long do_sched_op_compat(int cmd, unsigned long arg)
     case SCHEDOP_shutdown:
         TRACE_TIME(TRC_SCHED_SHUTDOWN,
                    current->domain->domain_id, current->vcpu_id, arg);
-        domain_shutdown(current->domain, (u8)arg);
-        break;
+        return domain_shutdown(current->domain, (u8)arg);
 
     default:
         return -ENOSYS;
diff --git a/xen/arch/x86/hvm/dm.c b/xen/arch/x86/hvm/dm.c
index 3b53471af0..e32338efae 100644
--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -545,8 +545,7 @@ int dm_op(const struct dmop_args *op_args)
         const struct xen_dm_op_remote_shutdown *data =
             &op.u.remote_shutdown;
 
-        domain_shutdown(d, data->reason);
-        rc = 0;
+        rc = domain_shutdown(d, data->reason);
         break;
     }
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 4d37a93c57..d3e5dcc30f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1728,8 +1728,13 @@ void hvm_vcpu_down(struct vcpu *v)
     /* ... Shut down the domain if not. */
     if ( online_count == 0 )
     {
+        int rc;
+
         gdprintk(XENLOG_INFO, "All CPUs offline -- powering off.\n");
-        domain_shutdown(d, SHUTDOWN_poweroff);
+
+        rc = domain_shutdown(d, SHUTDOWN_poweroff);
+        if ( unlikely(rc) )
+            gdprintk(XENLOG_ERR, "Failed to power off: rc=%d\n", rc);
     }
 }
 
@@ -1758,12 +1763,16 @@ void hvm_triple_fault(void)
     struct vcpu *v = current;
     struct domain *d = v->domain;
     u8 reason = d->arch.hvm.params[HVM_PARAM_TRIPLE_FAULT_REASON];
+    int rc;
 
     gprintk(XENLOG_ERR,
             "Triple fault - invoking HVM shutdown action %d\n",
             reason);
     vcpu_show_execution_state(v);
-    domain_shutdown(d, reason);
+    rc = domain_shutdown(d, reason);
+    if ( unlikely(rc) )
+        gprintk(XENLOG_ERR,
+                "Failed to shut down after triple fault: rc=%d\n", rc);
 }
 
 void hvm_inject_event(const struct x86_event *event)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index ab910fcf93..13198bcca5 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1355,6 +1355,8 @@ int domain_kill(struct domain *d)
 
 void __domain_crash(struct domain *d)
 {
+    int rc;
+
     if ( d->is_shutting_down )
     {
         /* Print nothing: the domain is already shutting down. */
@@ -1371,7 +1373,10 @@ void __domain_crash(struct domain *d)
                d->domain_id, current->domain->domain_id, smp_processor_id());
     }
 
-    domain_shutdown(d, SHUTDOWN_crash);
+    rc = domain_shutdown(d, SHUTDOWN_crash);
+    if ( unlikely(rc) )
+        printk(XENLOG_ERR
+               "Failed to shut down crashed domain %pd: rc=%d\n", d, rc);
 }
 
 
@@ -2194,7 +2199,7 @@ long common_vcpu_op(int cmd, struct vcpu *v, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         if ( !rc ) /* Last vcpu going down? */
         {
-            domain_shutdown(d, SHUTDOWN_poweroff);
+            rc = domain_shutdown(d, SHUTDOWN_poweroff);
             break;
         }
 
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index a57d5dd929..6df631d925 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -1537,6 +1537,7 @@ static void cf_check domain_watchdog_timeout(void *data)
      */
     struct domain *d = _p((unsigned long)data & PAGE_MASK);
     unsigned int id = (unsigned long)data & ~PAGE_MASK;
+    int rc;
 
     BUILD_BUG_ON(alignof(*d) < PAGE_SIZE);
 
@@ -1544,7 +1545,11 @@ static void cf_check domain_watchdog_timeout(void *data)
         return;
 
     printk("Watchdog timer %u fired for %pd\n", id, d);
-    domain_shutdown(d, SHUTDOWN_watchdog);
+
+    rc = domain_shutdown(d, SHUTDOWN_watchdog);
+    if ( unlikely(rc) )
+        printk(XENLOG_ERR
+               "Failed to shut down %pd after watchdog expiry: rc=%d\n", d, rc);
 }
 
 static long domain_watchdog(struct domain *d, uint32_t id, uint32_t timeout)
@@ -1977,7 +1982,7 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         ret = xsm_schedop_shutdown(XSM_DM_PRIV, current->domain, d);
         if ( likely(!ret) )
-            domain_shutdown(d, sched_remote_shutdown.reason);
+            ret = domain_shutdown(d, sched_remote_shutdown.reason);
 
         rcu_unlock_domain(d);
 
-- 
2.43.0
Re: [RFC PATCH] xen: handle domain_shutdown() return values
Posted by Jan Beulich 4 days, 4 hours ago
On 19.03.2026 13:42, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
> 
> Propagate domain_shutdown() return codes through the shutdown paths
> which can still report errors to their callers, and log explicit
> failures in fire-and-forget paths instead of silently discarding the
> result.
> 
> This makes the shutdown contract explicit for callers which can report
> errors, while preserving observable diagnostics for the remaining
> fire-and-forget paths.
> 
> It also fixes MISRA Dir 4.7 and Rule 17.7 violations by ensuring that
> the returned status is tested or otherwise used.
> 
> Suggested-by: Jan Beulich <jbeulich@suse.com>

I don't agree with this. For what you want to do (as per the link below)
this is a prereq, but as an independent change I'm not convinced this is
needed. Once it is grouped with that other change, it's kind of natural,
and hence any Suggested-by: would feel odd.

I'm further unconvinced logging is the right course of action in all of
the cases. Some may want to be assertions instead?

Jan

> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Link to discussion: https://patchew.org/Xen/cover.1748848482.git.mykola._5Fkvach@epam.com/7bd75ecfff5b0a75ea5abd7cc4934582d7e1250c.1748848482.git.mykola._5Fkvach@epam.com/#90048f71-8313-4110-924c-f956a2bec5a0@suse.com
Re: [RFC PATCH] xen: handle domain_shutdown() return values
Posted by Mykola Kvach 4 days, 3 hours ago
Hi Jan,

Thank you for the review.

On Thu, Mar 19, 2026 at 3:44 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.03.2026 13:42, Mykola Kvach wrote:
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > Propagate domain_shutdown() return codes through the shutdown paths
> > which can still report errors to their callers, and log explicit
> > failures in fire-and-forget paths instead of silently discarding the
> > result.
> >
> > This makes the shutdown contract explicit for callers which can report
> > errors, while preserving observable diagnostics for the remaining
> > fire-and-forget paths.
> >
> > It also fixes MISRA Dir 4.7 and Rule 17.7 violations by ensuring that
> > the returned status is tested or otherwise used.
> >
> > Suggested-by: Jan Beulich <jbeulich@suse.com>
>
> I don't agree with this. For what you want to do (as per the link below)
> this is a prereq, but as an independent change I'm not convinced this is
> needed. Once it is grouped with that other change, it's kind of natural,
> and hence any Suggested-by: would feel odd.

I see your point, but I'd still prefer to keep this as a standalone change.

It is no longer tied to the suspend/resume work, as the changes adding new
error cases there are gone. What remains is making the existing non-void
domain_shutdown() contract explicit at its call sites.

So from my perspective this patch stands on its own for two reasons:
- it fixes MISRA Dir 4.7 and Rule 17.7 issues by ensuring the returned status
  is tested or propagated;
- it avoids leaving latent bugs behind if domain_shutdown() gains additional
  failure cases in the future, beyond the currently relevant ones.

>
> I'm further unconvinced logging is the right course of action in all of
> the cases. Some may want to be assertions instead?

That said, I agree the handling likely shouldn't be uniform across all
callers. I can revisit the fire-and-forget paths and use assertions where
a non-zero return should be impossible, instead of logging unconditionally.

If I understand you correctly, then without any additional
suspend-related error case being introduced, you don't see enough
value in this as a standalone patch. Is that the right reading?

Best regards,
Mykola

>
> Jan
>
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Link to discussion: https://patchew.org/Xen/cover.1748848482.git.mykola._5Fkvach@epam.com/7bd75ecfff5b0a75ea5abd7cc4934582d7e1250c.1748848482.git.mykola._5Fkvach@epam.com/#90048f71-8313-4110-924c-f956a2bec5a0@suse.com
Re: [RFC PATCH] xen: handle domain_shutdown() return values
Posted by Jan Beulich 4 days, 2 hours ago
On 19.03.2026 15:35, Mykola Kvach wrote:
> On Thu, Mar 19, 2026 at 3:44 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 19.03.2026 13:42, Mykola Kvach wrote:
>>> From: Mykola Kvach <mykola_kvach@epam.com>
>>>
>>> Propagate domain_shutdown() return codes through the shutdown paths
>>> which can still report errors to their callers, and log explicit
>>> failures in fire-and-forget paths instead of silently discarding the
>>> result.
>>>
>>> This makes the shutdown contract explicit for callers which can report
>>> errors, while preserving observable diagnostics for the remaining
>>> fire-and-forget paths.
>>>
>>> It also fixes MISRA Dir 4.7 and Rule 17.7 violations by ensuring that
>>> the returned status is tested or otherwise used.
>>>
>>> Suggested-by: Jan Beulich <jbeulich@suse.com>
>>
>> I don't agree with this. For what you want to do (as per the link below)
>> this is a prereq, but as an independent change I'm not convinced this is
>> needed. Once it is grouped with that other change, it's kind of natural,
>> and hence any Suggested-by: would feel odd.
> 
> I see your point, but I'd still prefer to keep this as a standalone change.
> 
> It is no longer tied to the suspend/resume work, as the changes adding new
> error cases there are gone. What remains is making the existing non-void
> domain_shutdown() contract explicit at its call sites.
> 
> So from my perspective this patch stands on its own for two reasons:
> - it fixes MISRA Dir 4.7 and Rule 17.7 issues by ensuring the returned status
>   is tested or propagated;
> - it avoids leaving latent bugs behind if domain_shutdown() gains additional
>   failure cases in the future, beyond the currently relevant ones.
> 
>>
>> I'm further unconvinced logging is the right course of action in all of
>> the cases. Some may want to be assertions instead?
> 
> That said, I agree the handling likely shouldn't be uniform across all
> callers. I can revisit the fire-and-forget paths and use assertions where
> a non-zero return should be impossible, instead of logging unconditionally.
> 
> If I understand you correctly, then without any additional
> suspend-related error case being introduced, you don't see enough
> value in this as a standalone patch. Is that the right reading?

Not entirely sure. Much would depend on what the description of the change
would say. Addressing Misra concerns, even if just latent ones, is a valid
reason to make such changes, for example.

Jan