[PATCH v2] systemd: Add hooks to stop/start xen-watchdog on suspend/resume

Mykola Kvach posted 1 patch 3 months, 2 weeks ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/b44966513abc729f44795c0d5012e1c5fd106477.1752783296.git.mykola._5Fkvach@epam.com
Posted by Mykola Kvach 3 months, 2 weeks ago
From: Mykola Kvach <mykola_kvach@epam.com>

This patch adds a systemd sleep hook script to stop the xen-watchdog
service before system suspend and start it again after resume.

Stopping the watchdog before a system suspend may look unsafe, but
consider the alternative, where 'systemctl suspend' does not interact
with the running service at all. In that case, the Xen watchdog daemon
is frozen just before suspend. If that happens right before the daemon
would have sent a ping, and Xen has not yet marked the domain as
suspended (is_shutting_down), the Xen watchdog timer may expire and
raise a false alert.

This situation is very unlikely, because typically:
    ping interval = watchdog timeout / 2

and the watchdog timeout is usually set to a relatively large value
(tens of seconds).
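
To put rough numbers on this, here is a minimal sketch of the slack that
remains when the daemon is frozen just before its next ping. The timeout
values are illustrative assumptions, not xen-watchdog defaults, and
slack_for is a hypothetical helper.

```shell
#!/bin/sh
# Slack left if the daemon freezes just before its next ping:
# the last ping was one ping interval ago, so the time remaining
# until the watchdog expires is timeout minus ping interval.
slack_for() {
    timeout=$1
    ping_interval=$((timeout / 2))   # ping at half the timeout
    echo $((timeout - ping_interval))
}

long_slack=$(slack_for 30)   # illustrative "large" timeout, seconds
short_slack=$(slack_for 2)   # illustrative very short timeout, seconds
echo "timeout 30s: ${long_slack}s of slack"
echo "timeout 2s: ${short_slack}s of slack"
```

With a 30-second timeout the suspend path has about 15 seconds to mark
the domain as suspended; with a 2-second timeout it has about one.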

Still, a false alert becomes more likely with very short watchdog
timeouts. It may happen in the following scenarios:
    * A significant delay occurs between freezing Linux tasks and
      triggering the ACPI or PSCI sleep request or handler.
    * Xen takes a long time between entering the sleep trigger and
      actually forwarding the sleep request.

A similar situation may occur on resume with short timeouts. During the
resume operation, Xen restores timers and the domain context, and the
Xen watchdog timer resumes as well. If Xen schedules the domain right
before the watchdog timeout expires, and the daemon responsible for
pinging is not yet running, a timeout might occur.

Both scenarios are rare and typically require very small watchdog
timeouts combined with significant delays in Xen or the Linux kernel
during suspend/resume flows.

Conceptually, however, if activating and pinging the Xen watchdog is the
responsibility of the domain and its services, then the domain should
also manage the watchdog service/daemon lifecycle. This is similar to
what is already done by the Xen watchdog driver inside the Linux kernel.

Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V2:
- drop logging to separate files
- remove checks for xen-watchdog service existence at start of sleep script
- use XEN_RUN_DIR for saving watchdog service state before sleep
- remove loop when installing sleep script for xen-watchdog service
- introduce new configs XEN_SYSTEMD_SLEEP_DIR, SYSTEMD_SLEEP_DIR, and
  with-systemd-sleep
---
 config/Tools.mk.in                            |  1 +
 m4/systemd.m4                                 | 14 ++++++++
 tools/hotplug/Linux/systemd/Makefile          |  8 ++++-
 .../Linux/systemd/xen-watchdog-sleep.sh       | 34 +++++++++++++++++++
 4 files changed, 56 insertions(+), 1 deletion(-)
 create mode 100644 tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh

diff --git a/config/Tools.mk.in b/config/Tools.mk.in
index 463ab75965..e47ac23d11 100644
--- a/config/Tools.mk.in
+++ b/config/Tools.mk.in
@@ -53,6 +53,7 @@ CONFIG_LIBFSIMAGE   := @libfsimage@
 CONFIG_SYSTEMD      := @systemd@
 XEN_SYSTEMD_DIR     := @SYSTEMD_DIR@
 XEN_SYSTEMD_MODULES_LOAD := @SYSTEMD_MODULES_LOAD@
+XEN_SYSTEMD_SLEEP_DIR := @SYSTEMD_SLEEP_DIR@
 CONFIG_9PFS         := @ninepfs@
 
 LINUX_BACKEND_MODULES := @LINUX_BACKEND_MODULES@
diff --git a/m4/systemd.m4 b/m4/systemd.m4
index ab12ea313d..ee684d3391 100644
--- a/m4/systemd.m4
+++ b/m4/systemd.m4
@@ -28,6 +28,12 @@ AC_DEFUN([AX_SYSTEMD_OPTIONS], [
 		[set directory for systemd modules load files [PREFIX/lib/modules-load.d/]]),
 		[SYSTEMD_MODULES_LOAD="$withval"], [SYSTEMD_MODULES_LOAD=""])
 	AC_SUBST(SYSTEMD_MODULES_LOAD)
+
+	AC_ARG_WITH(systemd-sleep,
+		AS_HELP_STRING([--with-systemd-sleep=DIR],
+		[set directory for systemd sleep script files [PREFIX/lib/systemd/system-sleep/]]),
+		[SYSTEMD_SLEEP_DIR="$withval"], [SYSTEMD_SLEEP_DIR=""])
+	AC_SUBST(SYSTEMD_SLEEP_DIR)
 ])
 
 AC_DEFUN([AX_ENABLE_SYSTEMD_OPTS], [
@@ -69,6 +75,14 @@ AC_DEFUN([AX_CHECK_SYSTEMD_LIBS], [
 	AS_IF([test "x$SYSTEMD_MODULES_LOAD" = x], [
 	    AC_MSG_ERROR([SYSTEMD_MODULES_LOAD is unset])
 	], [])
+
+	AS_IF([test "x$SYSTEMD_SLEEP_DIR" = x], [
+	    SYSTEMD_SLEEP_DIR="\$(prefix)/lib/systemd/system-sleep/"
+	], [])
+
+	AS_IF([test "x$SYSTEMD_SLEEP_DIR" = x], [
+	    AC_MSG_ERROR([SYSTEMD_SLEEP_DIR is unset])
+	], [])
 ])
 
 AC_DEFUN([AX_CHECK_SYSTEMD], [
diff --git a/tools/hotplug/Linux/systemd/Makefile b/tools/hotplug/Linux/systemd/Makefile
index e29889156d..579ef9d87d 100644
--- a/tools/hotplug/Linux/systemd/Makefile
+++ b/tools/hotplug/Linux/systemd/Makefile
@@ -5,6 +5,8 @@ XEN_SYSTEMD_MODULES := xen.conf
 
 XEN_SYSTEMD_MOUNT := proc-xen.mount
 
+XEN_SYSTEMD_SLEEP_SCRIPT := xen-watchdog-sleep.sh
+
 XEN_SYSTEMD_SERVICE := xenstored.service
 XEN_SYSTEMD_SERVICE += xenconsoled.service
 XEN_SYSTEMD_SERVICE += xen-qemu-dom0-disk-backend.service
@@ -15,7 +17,8 @@ XEN_SYSTEMD_SERVICE += xendriverdomain.service
 
 ALL_XEN_SYSTEMD :=	$(XEN_SYSTEMD_MODULES)  \
 			$(XEN_SYSTEMD_MOUNT)	\
-			$(XEN_SYSTEMD_SERVICE)
+			$(XEN_SYSTEMD_SERVICE)	\
+			$(XEN_SYSTEMD_SLEEP_SCRIPT)
 
 .PHONY: all
 all:	$(ALL_XEN_SYSTEMD)
@@ -31,15 +34,18 @@ distclean: clean
 install: $(ALL_XEN_SYSTEMD)
 	$(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_DIR)
 	$(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD)
+	$(INSTALL_DIR) $(DESTDIR)$(XEN_SYSTEMD_SLEEP_DIR)
 	$(INSTALL_DATA) *.service $(DESTDIR)$(XEN_SYSTEMD_DIR)
 	$(INSTALL_DATA) *.mount $(DESTDIR)$(XEN_SYSTEMD_DIR)
 	$(INSTALL_DATA) *.conf $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD)
+	$(INSTALL_PROG) $(XEN_SYSTEMD_SLEEP_SCRIPT) $(DESTDIR)$(XEN_SYSTEMD_SLEEP_DIR)
 
 .PHONY: uninstall
 uninstall:
 	rm -f $(DESTDIR)$(XEN_SYSTEMD_MODULES_LOAD)/*.conf
 	rm -f $(DESTDIR)$(XEN_SYSTEMD_DIR)/*.mount
 	rm -f $(DESTDIR)$(XEN_SYSTEMD_DIR)/*.service
+	rm -f $(DESTDIR)$(XEN_SYSTEMD_SLEEP_DIR)/$(XEN_SYSTEMD_SLEEP_SCRIPT)
 
 $(XEN_SYSTEMD_MODULES):
 	rm -f $@.tmp
diff --git a/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh b/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh
new file mode 100644
index 0000000000..e9bdadc8fa
--- /dev/null
+++ b/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh
@@ -0,0 +1,34 @@
+#!/bin/sh
+
+# The first argument ($1) is:
+#     "pre" or "post"
+# The second argument ($2) is:
+#     "suspend", "hibernate", "hybrid-sleep", or "suspend-then-hibernate"
+
+. /etc/xen/scripts/hotplugpath.sh
+
+SERVICE_NAME="xen-watchdog.service"
+STATE_FILE="${XEN_RUN_DIR}/xen-watchdog-sleep-flag"
+
+case "$1" in
+pre)
+    if systemctl is-active --quiet "${SERVICE_NAME}"; then
+        touch "${STATE_FILE}"
+        echo "Stopping ${SERVICE_NAME} before $2."
+        systemctl stop "${SERVICE_NAME}"
+    fi
+    ;;
+post)
+    if [ -f "${STATE_FILE}" ]; then
+        echo "Starting ${SERVICE_NAME} after $2."
+        systemctl start "${SERVICE_NAME}"
+        rm "${STATE_FILE}"
+    fi
+    ;;
+*)
+    echo "Script called with unknown action '$1'. Arguments: '$@'"
+    exit 1
+    ;;
+esac
+
+exit 0
-- 
2.48.1
Re: [PATCH v2] systemd: Add hooks to stop/start xen-watchdog on suspend/resume
Posted by Andrew Cooper 2 months, 3 weeks ago
On 17/07/2025 9:16 pm, Mykola Kvach wrote:
>  config/Tools.mk.in                            |  1 +
>  m4/systemd.m4                                 | 14 ++++++++
>  tools/hotplug/Linux/systemd/Makefile          |  8 ++++-
>  .../Linux/systemd/xen-watchdog-sleep.sh       | 34 +++++++++++++++++++

This has been committed, but it drops the file:

  /usr/lib/systemd/system-sleep/xen-watchdog-sleep.sh

into a directory which more normally contains:

$ file /usr/lib/systemd/system-sleep/*
/usr/lib/systemd/system-sleep/hdparm:              POSIX shell script, ASCII text executable
/usr/lib/systemd/system-sleep/nvidia:              POSIX shell script, ASCII text executable
/usr/lib/systemd/system-sleep/sysstat.sleep:       POSIX shell script, ASCII text executable
/usr/lib/systemd/system-sleep/tlp:                 POSIX shell script, ASCII text executable
/usr/lib/systemd/system-sleep/unattended-upgrades: POSIX shell script, ASCII text executable


I'd suggest renaming it to xen-watchdog (or perhaps xen-watchdog.sleep).

~Andrew

Re: [PATCH v2] systemd: Add hooks to stop/start xen-watchdog on suspend/resume
Posted by dmkhn@proton.me 3 months ago
On Thu, Jul 17, 2025 at 11:16:58PM +0300, Mykola Kvach wrote:
> From: Mykola Kvach <mykola_kvach@epam.com>
> 
>  $(XEN_SYSTEMD_MODULES):
>  	rm -f $@.tmp
> diff --git a/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh b/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh
> new file mode 100644
> index 0000000000..e9bdadc8fa
> --- /dev/null
> +++ b/tools/hotplug/Linux/systemd/xen-watchdog-sleep.sh
> @@ -0,0 +1,34 @@
> +#!/bin/sh

Is there a chance to add `set -e` to harden the error path?
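
For illustration, a sketch of how the hook's pre/post logic might behave
under `set -e`. Two things here are assumptions made for the demo and
are not part of the patch: `systemctl` is replaced by a stub function so
the snippet runs without systemd, and the state file lives in a
temporary directory instead of XEN_RUN_DIR. The `hook` wrapper function
is also hypothetical.

```shell
#!/bin/sh
set -e   # abort on the first command that fails outside a condition

# Stub standing in for the real systemctl (demo assumption).
systemctl() {
    case "$1" in
    is-active) return 0 ;;          # pretend the service is running
    *) echo "systemctl $*" ;;
    esac
}

SERVICE_NAME="xen-watchdog.service"
STATE_DIR=$(mktemp -d)              # demo stand-in for XEN_RUN_DIR
STATE_FILE="${STATE_DIR}/xen-watchdog-sleep-flag"

hook() {
    case "$1" in
    pre)
        # A failing command used as an `if` condition does not trip set -e.
        if systemctl is-active --quiet "${SERVICE_NAME}"; then
            touch "${STATE_FILE}"
            systemctl stop "${SERVICE_NAME}"
        fi
        ;;
    post)
        if [ -f "${STATE_FILE}" ]; then
            systemctl start "${SERVICE_NAME}"
            rm "${STATE_FILE}"
        fi
        ;;
    *)
        echo "unknown action '$1'" >&2
        return 1
        ;;
    esac
}

hook pre suspend
if [ -f "${STATE_FILE}" ]; then pre_flag=yes; else pre_flag=no; fi
hook post suspend
if [ -f "${STATE_FILE}" ]; then post_flag=yes; else post_flag=no; fi
echo "flag after pre: ${pre_flag}, after post: ${post_flag}"
rm -r "${STATE_DIR}"
```

With `set -e`, a failing `systemctl stop` would end the script with a
non-zero status instead of falling through to `exit 0`. The flag file
would already exist at that point, so the post step would still try to
start the service.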

With that correction, please consider:

Reviewed-by: Denis Mukhin <dmukhin@ford.com>

Re: [PATCH v2] systemd: Add hooks to stop/start xen-watchdog on suspend/resume
Posted by Anthony PERARD 3 months, 1 week ago
On Thu, Jul 17, 2025 at 11:16:58PM +0300, Mykola Kvach wrote:
> diff --git a/m4/systemd.m4 b/m4/systemd.m4
> index ab12ea313d..ee684d3391 100644
> --- a/m4/systemd.m4
> +++ b/m4/systemd.m4
> @@ -28,6 +28,12 @@ AC_DEFUN([AX_SYSTEMD_OPTIONS], [
>  		[set directory for systemd modules load files [PREFIX/lib/modules-load.d/]]),
>  		[SYSTEMD_MODULES_LOAD="$withval"], [SYSTEMD_MODULES_LOAD=""])
>  	AC_SUBST(SYSTEMD_MODULES_LOAD)
> +
> +	AC_ARG_WITH(systemd-sleep,
> +		AS_HELP_STRING([--with-systemd-sleep=DIR],
> +		[set directory for systemd sleep script files [PREFIX/lib/systemd/system-sleep/]]),
> +		[SYSTEMD_SLEEP_DIR="$withval"], [SYSTEMD_SLEEP_DIR=""])
> +	AC_SUBST(SYSTEMD_SLEEP_DIR)
>  ])
>  
>  AC_DEFUN([AX_ENABLE_SYSTEMD_OPTS], [
> @@ -69,6 +75,14 @@ AC_DEFUN([AX_CHECK_SYSTEMD_LIBS], [
>  	AS_IF([test "x$SYSTEMD_MODULES_LOAD" = x], [
>  	    AC_MSG_ERROR([SYSTEMD_MODULES_LOAD is unset])
>  	], [])
> +
> +	AS_IF([test "x$SYSTEMD_SLEEP_DIR" = x], [
> +	    SYSTEMD_SLEEP_DIR="\$(prefix)/lib/systemd/system-sleep/"

While reading this change and systemd.m4, I noticed a comment about
using pkg-config. It is about $SYSTEMD_DIR, but I believe it applies
here too. It looks like we can replace this hard-coded path with:

    PKG_CHECK_VAR([SYSTEMD_SLEEP_DIR], [systemd], [systemdsleepdir])

which will query the system-sleep path from the system. (This just runs
`pkg-config --variable=systemdsleepdir systemd` and stores the result in
SYSTEMD_SLEEP_DIR.) (The variable is now named "systemd_sleep_dir", but
the name without underscores is still available, and has been available
for longer.)
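
The query can also be tried by hand. A small sketch, assuming that
pkg-config or systemd's .pc file may be missing on the machine; the
fallback path is the default from the patch's help string with /usr
taken as the prefix, which is an assumption for this demo only.

```shell
#!/bin/sh
# Ask pkg-config for systemd's sleep-hook directory; stay quiet if either
# pkg-config or systemd's .pc file is missing.
sleep_dir=""
if command -v pkg-config >/dev/null 2>&1; then
    sleep_dir=$(pkg-config --variable=systemdsleepdir systemd 2>/dev/null || true)
fi

# Assumed fallback for the demo: PREFIX/lib/systemd/system-sleep, PREFIX=/usr.
if [ -z "${sleep_dir}" ]; then
    sleep_dir="/usr/lib/systemd/system-sleep"
fi
echo "system-sleep directory: ${sleep_dir}"
```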

Would using PKG_CHECK_VAR be fine with you?

> +	], [])
> +
> +	AS_IF([test "x$SYSTEMD_SLEEP_DIR" = x], [
> +	    AC_MSG_ERROR([SYSTEMD_SLEEP_DIR is unset])
> +	], [])

After changing to use PKG_CHECK_VAR, I think this patch would be good to
go, so: Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
I can make the change on commit if that's ok.

And I need to remember to run `./autogen.sh` to regen the configure
scripts.

Thanks,

-- 
Anthony PERARD
Re: [PATCH v2] systemd: Add hooks to stop/start xen-watchdog on suspend/resume
Posted by Mykola Kvach 3 months ago
On Fri, Jul 25, 2025 at 5:04 PM Anthony PERARD <anthony@xenproject.org> wrote:
>
> On Thu, Jul 17, 2025 at 11:16:58PM +0300, Mykola Kvach wrote:
> > diff --git a/m4/systemd.m4 b/m4/systemd.m4
> > index ab12ea313d..ee684d3391 100644
> > --- a/m4/systemd.m4
> > +++ b/m4/systemd.m4
> > @@ -28,6 +28,12 @@ AC_DEFUN([AX_SYSTEMD_OPTIONS], [
> >               [set directory for systemd modules load files [PREFIX/lib/modules-load.d/]]),
> >               [SYSTEMD_MODULES_LOAD="$withval"], [SYSTEMD_MODULES_LOAD=""])
> >       AC_SUBST(SYSTEMD_MODULES_LOAD)
> > +
> > +     AC_ARG_WITH(systemd-sleep,
> > +             AS_HELP_STRING([--with-systemd-sleep=DIR],
> > +             [set directory for systemd sleep script files [PREFIX/lib/systemd/system-sleep/]]),
> > +             [SYSTEMD_SLEEP_DIR="$withval"], [SYSTEMD_SLEEP_DIR=""])
> > +     AC_SUBST(SYSTEMD_SLEEP_DIR)
> >  ])
> >
> >  AC_DEFUN([AX_ENABLE_SYSTEMD_OPTS], [
> > @@ -69,6 +75,14 @@ AC_DEFUN([AX_CHECK_SYSTEMD_LIBS], [
> >       AS_IF([test "x$SYSTEMD_MODULES_LOAD" = x], [
> >           AC_MSG_ERROR([SYSTEMD_MODULES_LOAD is unset])
> >       ], [])
> > +
> > +     AS_IF([test "x$SYSTEMD_SLEEP_DIR" = x], [
> > +         SYSTEMD_SLEEP_DIR="\$(prefix)/lib/systemd/system-sleep/"
>
> While reading this change and systemd.m4, I noticed a comment about
> using pkg-config. It is about $SYSTEMD_DIR, but I believe it applies
> here too. It looks like we can replace this hard-coded path with:
>
>     PKG_CHECK_VAR([SYSTEMD_SLEEP_DIR], [systemd], [systemdsleepdir])
>
> which will query the system-sleep path from the system. (This just runs
> `pkg-config --variable=systemdsleepdir systemd` and stores the result in
> SYSTEMD_SLEEP_DIR.) (The variable is now named "systemd_sleep_dir", but
> the name without underscores is still available, and has been available
> for longer.)
>
> Would using PKG_CHECK_VAR be fine with you?
>
> > +     ], [])
> > +
> > +     AS_IF([test "x$SYSTEMD_SLEEP_DIR" = x], [
> > +         AC_MSG_ERROR([SYSTEMD_SLEEP_DIR is unset])
> > +     ], [])
>
> After changing to use PKG_CHECK_VAR, I think this patch would be good to
> go, so: Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
> I can make the change on commit if that's ok.

That’s definitely OK, thank you for taking care of it.

>
> And I need to remember to run `./autogen.sh` to regen the configure
> scripts.
>
> Thanks,
>
> --
> Anthony PERARD

Best regards,
Mykola