From: Daniel P. Berrangé
To: devel@lists.libvirt.org
Cc: Daniel P. Berrangé
Subject: [PATCH 25/26] rpc: don't let systemd shutdown daemon while saving VMs
Date: Wed, 8 Jan 2025 19:42:58 +0000
Message-ID: <20250108194259.1171990-26-berrange@redhat.com>
In-Reply-To: <20250108194259.1171990-1-berrange@redhat.com>
References: <20250108194259.1171990-1-berrange@redhat.com>
MIME-Version: 1.0
List-Id: Development discussions about the libvirt library & tools
Content-Type: text/plain; charset="utf-8"

The service unit "TimeoutStopSec" setting controls how long systemd
waits for a service to stop before aggressively killing it, defaulting
to 30 seconds if not set.

When we're processing shutdown of VMs in response to OS shutdown, we
very likely need more than 30 seconds to complete this job, and cannot
stop the daemon during this time.

To avoid being prematurely killed, set up a timer that repeatedly
extends the "TimeoutStopSec" value while the running VMs are being
stopped.

This does mean that if libvirt hangs while stopping VMs, systemd won't
get to kill the libvirt daemon, but this is considered less harmful
than forcefully killing running VMs.

Signed-off-by: Daniel P. Berrangé
---
 src/rpc/virnetdaemon.c | 62 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/src/rpc/virnetdaemon.c b/src/rpc/virnetdaemon.c
index 8cc7af1182..3ddb9b5404 100644
--- a/src/rpc/virnetdaemon.c
+++ b/src/rpc/virnetdaemon.c
@@ -78,6 +78,9 @@ struct _virNetDaemon {
     virNetDaemonShutdownCallback shutdownPrepareCb;
     virNetDaemonShutdownCallback shutdownWaitCb;
     virThread *shutdownPreserveThread;
+    unsigned long long preserveStart;
+    unsigned int preserveExtended;
+    int preserveTimer;
     int quitTimer;
     virNetDaemonQuitPhase quit;
     bool graceful;
@@ -93,6 +96,14 @@ struct _virNetDaemon {
 
 static virClass *virNetDaemonClass;
 
+/*
+ * The minimum additional shutdown time (secs) we should ask
+ * systemd to allow, while state preservation operations
+ * are running. A timer will run every 5 seconds, and
+ * ensure at least this much extra time is requested
+ */
+#define VIR_NET_DAEMON_PRESERVE_MIN_TIME 30
+
 static int
 daemonServerClose(void *payload,
                   const char *key G_GNUC_UNUSED,
@@ -162,6 +173,7 @@ virNetDaemonNew(void)
     if (virEventRegisterDefaultImpl() < 0)
         goto error;
 
+    dmn->preserveTimer = -1;
     dmn->autoShutdownTimerID = -1;
 
 #ifndef WIN32
@@ -727,6 +739,42 @@ daemonShutdownWait(void *opaque)
     }
 }
 
+static void
+virNetDaemonPreserveTimer(int timerid G_GNUC_UNUSED,
+                          void *opaque)
+{
+    virNetDaemon *dmn = opaque;
+    VIR_LOCK_GUARD lock = virObjectLockGuard(dmn);
+    unsigned long long now = g_get_monotonic_time();
+    unsigned long long delta;
+
+    if (dmn->quit != VIR_NET_DAEMON_QUIT_PRESERVING)
+        return;
+
+    VIR_DEBUG("Started at %llu now %llu extended %u",
+              dmn->preserveStart, now, dmn->preserveExtended);
+
+    /* Time since start of preserving state in usec */
+    delta = now - dmn->preserveStart;
+    /* Converts to secs */
+    delta /= (1000ull * 1000ull);
+
+    /* Want extra seconds grace to ensure this timer fires
+     * again before system timeout expires, under high
+     * load conditions */
+    delta += VIR_NET_DAEMON_PRESERVE_MIN_TIME;
+
+    /* Deduct any extension we've previously asked for */
+    delta -= dmn->preserveExtended;
+
+    /* Tell systemd how much more we need to extend by */
+    virSystemdNotifyExtendTimeout(delta);
+    dmn->preserveExtended += delta;
+
+    VIR_DEBUG("Extended by %llu", delta);
+}
+
+
 static void
 virNetDaemonQuitTimer(int timerid G_GNUC_UNUSED,
                       void *opaque)
@@ -781,11 +829,21 @@ virNetDaemonRun(virNetDaemon *dmn)
 
         if (dmn->quit == VIR_NET_DAEMON_QUIT_REQUESTED) {
             VIR_DEBUG("Process quit request");
+            virSystemdNotifyStopping();
             virHashForEach(dmn->servers, daemonServerClose, NULL);
 
             if (dmn->shutdownPreserveThread) {
                 VIR_DEBUG("Shutdown preserve thread running");
                 dmn->quit = VIR_NET_DAEMON_QUIT_PRESERVING;
+                dmn->preserveStart = g_get_monotonic_time();
+                dmn->preserveExtended = VIR_NET_DAEMON_PRESERVE_MIN_TIME;
+                virSystemdNotifyExtendTimeout(dmn->preserveExtended);
+                if ((dmn->preserveTimer = virEventAddTimeout(5 * 1000,
+                                                             virNetDaemonPreserveTimer,
+                                                             dmn, NULL)) < 0) {
+                    VIR_WARN("Failed to register preservation timer");
+                    /* hope for the best */
+                }
             } else {
                 VIR_DEBUG("Ready to shutdown");
                 dmn->quit = VIR_NET_DAEMON_QUIT_READY;
@@ -866,6 +924,10 @@ static void virNetDaemonPreserveWorker(void *opaque)
             dmn->quit = VIR_NET_DAEMON_QUIT_READY;
         }
         g_clear_pointer(&dmn->shutdownPreserveThread, g_free);
+        if (dmn->preserveTimer != -1) {
+            virEventRemoveTimeout(dmn->preserveTimer);
+            dmn->preserveTimer = -1;
+        }
     }
 
     VIR_DEBUG("End preserve dmn=%p", dmn);
-- 
2.47.1