From: Daniel P. Berrangé
To: devel@lists.libvirt.org
Cc: Daniel P. Berrangé
Subject: [PATCH v2 21/22] rpc: don't let systemd shutdown daemon while saving VMs
Date: Wed, 12 Mar 2025 17:18:01 +0000
Message-ID: <20250312171802.1854985-22-berrange@redhat.com>
In-Reply-To: <20250312171802.1854985-1-berrange@redhat.com>
References: <20250312171802.1854985-1-berrange@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The service unit "TimeoutStopSec" setting controls how long systemd
waits for a service to stop before aggressively killing it, defaulting
to 30 seconds if not set.

When we're processing shutdown of VMs in response to OS shutdown, we
very likely need more than 30 seconds to complete this job, and cannot
stop the daemon during this time.

To avoid being prematurely killed, set up a timer that repeatedly
extends the "TimeoutStopSec" value while the stop of running VMs is
arranged. This does mean that if libvirt hangs while stopping VMs,
systemd won't get to kill the libvirt daemon, but this is considered
less harmful than forcefully killing running VMs.

Signed-off-by: Daniel P. Berrangé
Reviewed-by: Peter Krempa
---
 src/rpc/virnetdaemon.c | 53 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 52 insertions(+), 1 deletion(-)

diff --git a/src/rpc/virnetdaemon.c b/src/rpc/virnetdaemon.c
index 53dee60703..944a832ea8 100644
--- a/src/rpc/virnetdaemon.c
+++ b/src/rpc/virnetdaemon.c
@@ -84,6 +84,7 @@ struct _virNetDaemon {
     virNetDaemonLifecycleCallback shutdownPrepareCb;
     virNetDaemonLifecycleCallback shutdownWaitCb;
     virThread *stateStopThread;
+    int stopTimer;
     int quitTimer;
     virNetDaemonQuitPhase quit;
     bool graceful;
@@ -99,6 +100,25 @@ struct _virNetDaemon {
 
 static virClass *virNetDaemonClass;
 
+/*
+ * When running state stop operation which can be slow...
+ *
+ * How frequently we tell systemd to extend our stop time,
+ * and how much we ask for each time. The latter should
+ * exceed the former with a decent tolerance for high load
+ * scenarios
+ */
+#define VIR_NET_DAEMON_STOP_EXTEND_INTERVAL_MSEC (5 * 1000)
+#define VIR_NET_DAEMON_STOP_EXTRA_TIME_SEC 10
+
+/*
+ * When running daemon shutdown synchronization which
+ * ought to be moderately fast
+ */
+#define VIR_NET_DAEMON_SHUTDOWN_TIMEOUT_SEC 30
+#define VIR_NET_DAEMON_SHUTDOWN_TIMEOUT_MSEC (VIR_NET_DAEMON_SHUTDOWN_TIMEOUT_SEC * 1000)
+
+
 static int
 daemonServerClose(void *payload,
                   const char *key G_GNUC_UNUSED,
@@ -168,6 +188,7 @@ virNetDaemonNew(void)
     if (virEventRegisterDefaultImpl() < 0)
         goto error;
 
+    dmn->stopTimer = -1;
     dmn->autoShutdownTimerID = -1;
 
 #ifndef WIN32
@@ -737,6 +758,23 @@ daemonShutdownWait(void *opaque)
     }
 }
 
+static void
+virNetDaemonStopTimer(int timerid G_GNUC_UNUSED,
+                      void *opaque)
+{
+    virNetDaemon *dmn = opaque;
+    VIR_LOCK_GUARD lock = virObjectLockGuard(dmn);
+
+    if (dmn->quit != VIR_NET_DAEMON_QUIT_STOPPING)
+        return;
+
+    VIR_DEBUG("Extending stop timeout %u",
+              VIR_NET_DAEMON_STOP_EXTRA_TIME_SEC);
+
+    virSystemdNotifyExtendTimeout(VIR_NET_DAEMON_STOP_EXTRA_TIME_SEC);
+}
+
+
 static void
 virNetDaemonQuitTimer(int timerid G_GNUC_UNUSED,
                       void *opaque)
@@ -791,11 +829,19 @@ virNetDaemonRun(virNetDaemon *dmn)
 
         if (dmn->quit == VIR_NET_DAEMON_QUIT_REQUESTED) {
             VIR_DEBUG("Process quit request");
+            virSystemdNotifyStopping();
             virHashForEach(dmn->servers, daemonServerClose, NULL);
 
             if (dmn->stateStopThread) {
                 VIR_DEBUG("State stop thread running");
                 dmn->quit = VIR_NET_DAEMON_QUIT_STOPPING;
+                virSystemdNotifyExtendTimeout(VIR_NET_DAEMON_STOP_EXTRA_TIME_SEC);
+                if ((dmn->stopTimer = virEventAddTimeout(VIR_NET_DAEMON_STOP_EXTEND_INTERVAL_MSEC,
+                                                         virNetDaemonStopTimer,
+                                                         dmn, NULL)) < 0) {
+                    VIR_WARN("Failed to register stop timer");
+                    /* hope for the best */
+                }
             } else {
                 VIR_DEBUG("Ready to shutdown");
                 dmn->quit = VIR_NET_DAEMON_QUIT_READY;
@@ -807,7 +853,8 @@ virNetDaemonRun(virNetDaemon *dmn)
             if (dmn->shutdownPrepareCb && dmn->shutdownPrepareCb() < 0)
                 break;
 
-            if ((dmn->quitTimer = virEventAddTimeout(30 * 1000,
+            virSystemdNotifyExtendTimeout(VIR_NET_DAEMON_SHUTDOWN_TIMEOUT_SEC);
+            if ((dmn->quitTimer = virEventAddTimeout(VIR_NET_DAEMON_SHUTDOWN_TIMEOUT_MSEC,
                                                      virNetDaemonQuitTimer,
                                                      dmn, NULL)) < 0) {
                 VIR_WARN("Failed to register finish timer.");
@@ -879,6 +926,10 @@ virNetDaemonStopWorker(void *opaque)
             dmn->quit = VIR_NET_DAEMON_QUIT_READY;
         }
         g_clear_pointer(&dmn->stateStopThread, g_free);
+        if (dmn->stopTimer != -1) {
+            virEventRemoveTimeout(dmn->stopTimer);
+            dmn->stopTimer = -1;
+        }
     }
 
     VIR_DEBUG("End stop dmn=%p", dmn);
-- 
2.48.1
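
For readers unfamiliar with the mechanism this patch relies on, here is a
minimal standalone sketch of what a "stopping" and an "extend timeout"
notification look like on the wire, assuming the documented sd_notify(3)
datagram protocol ($NOTIFY_SOCKET, "STOPPING=1", "EXTEND_TIMEOUT_USEC=...").
The helper names format_extend_timeout() and notify_systemd() are
hypothetical and exist only for this illustration; they are not the libvirt
virSystemdNotify* implementation, which is not shown in this patch.

```c
/* Illustration only: hypothetical helpers, not libvirt code. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Build the message asking systemd for `secs` more seconds.
 * Returns the message length, or -1 if it does not fit in buf. */
static int
format_extend_timeout(char *buf, size_t buflen, unsigned int secs)
{
    assert(buf != NULL);
    int n = snprintf(buf, buflen, "EXTEND_TIMEOUT_USEC=%llu",
                     (unsigned long long)secs * 1000000ULL);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

/* Send one datagram to the socket named by $NOTIFY_SOCKET.
 * Returns 0 if no usable socket is set (e.g. not started by systemd),
 * 1 if the message was sent, -1 on error. */
static int
notify_systemd(const char *msg)
{
    const char *path = getenv("NOTIFY_SOCKET");
    struct sockaddr_un addr;
    size_t pathlen;
    ssize_t rc;
    int fd;

    if (!path || (path[0] != '/' && path[0] != '@'))
        return 0;

    pathlen = strlen(path);
    if (pathlen >= sizeof(addr.sun_path))
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    memcpy(addr.sun_path, path, pathlen);
    if (path[0] == '@')            /* abstract socket: leading NUL byte */
        addr.sun_path[0] = '\0';

    if ((fd = socket(AF_UNIX, SOCK_DGRAM, 0)) < 0)
        return -1;

    rc = sendto(fd, msg, strlen(msg), 0, (struct sockaddr *)&addr,
                (socklen_t)(offsetof(struct sockaddr_un, sun_path) + pathlen));
    close(fd);
    return rc < 0 ? -1 : 1;
}
```

With these helpers, the sequence in the patch would roughly correspond to
calling notify_systemd("STOPPING=1") once, then formatting and sending an
extension for 10 seconds every 5 seconds until the stop thread finishes.
Each extension only holds until the requested time elapses, which is why the
requested time has to exceed the timer interval with some margin.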