From: Jiri Denemark
To: devel@lists.libvirt.org
Subject: [PATCH] virsh: Fix --timeout option of migrate command
Date: Fri, 6 Dec 2024 17:25:11 +0100
Message-ID: <6f1f0edbe9cc5c0d36656b4a1b6b86f87ebdc1b1.1733502311.git.jdenemar@redhat.com>
List-Id: Development discussions about the libvirt library & tools

When starting a migration with --timeout, we create a thread to call the
migration API and in parallel set up a timer for the timeout. The
description of --timeout says: "run action specified by --timeout-*
option (suspend by default) if live migration exceeds timeout", which is
not really the way this feature was implemented.

Before live migration starts we first need to contact the source to get
the domain definition and send it to the destination where a new QEMU
process has to be started. This can take some (unpredictably long) time
while the timeout timer is already running. If a very short timeout is
set (which doesn't really make sense, but it's allowed), we may even end
up taking the timeout action before the actual migration has had a
chance to start.

With this patch the timeout is started only after we get non-zero
dataTotal from virDomainGetJobInfo, which means the migration (of either
storage or memory) has really started.

https://issues.redhat.com/browse/RHEL-41264

Signed-off-by: Jiri Denemark
Reviewed-by: Michal Privoznik
---
 tools/virsh-domain.c | 55 ++++++++++++++++++++++++++++----------------
 1 file changed, 35 insertions(+), 20 deletions(-)

diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index e4923284af..546db955a9 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -4237,7 +4237,10 @@ typedef void (*jobWatchTimeoutFunc)(vshControl *ctl, virDomainPtr dom,
 struct virshWatchData {
     vshControl *ctl;
     virDomainPtr dom;
+    GMainContext *context;
     jobWatchTimeoutFunc timeout_func;
+    int timeout_secs;
+    GSource *timeout_src;
     void *opaque;
     const char *label;
     GIOChannel *stdin_ioc;
@@ -4259,6 +4262,20 @@ virshWatchTimeout(gpointer opaque)
 }
 
 
+static void
+virshWatchSetTimeout(struct virshWatchData *data)
+{
+    vshDebug(data->ctl, VSH_ERR_DEBUG,
+             "watchJob: setting timeout of %d secs\n", data->timeout_secs);
+
+    data->timeout_src = g_timeout_source_new_seconds(data->timeout_secs);
+    g_source_set_callback(data->timeout_src,
+                          virshWatchTimeout,
+                          data, NULL);
+    g_source_attach(data->timeout_src, data->context);
+}
+
+
 static gboolean
 virshWatchProgress(gpointer opaque)
 {
@@ -4290,10 +4307,17 @@ virshWatchProgress(gpointer opaque)
          jobinfo.type == VIR_DOMAIN_JOB_UNBOUNDED)) {
         vshTTYDisableInterrupt(data->ctl);
         data->jobStarted = true;
+        vshDebug(data->ctl, VSH_ERR_DEBUG,
+                 "watchJob: job started\n");
+    }
 
-        if (!data->verbose) {
+    if (data->jobStarted) {
+        if (data->timeout_secs > 0 && !data->timeout_src) {
+            if (jobinfo.dataTotal > 0)
+                virshWatchSetTimeout(data);
+        } else if (!data->verbose) {
             vshDebug(data->ctl, VSH_ERR_DEBUG,
-                     "watchJob: job started, disabling callback\n");
+                     "watchJob: disabling callback\n");
             return G_SOURCE_REMOVE;
         }
     }
@@ -4356,13 +4380,15 @@ virshWatchJob(vshControl *ctl,
     struct sigaction sig_action;
     struct sigaction old_sig_action;
 #endif /* !WIN32 */
-    g_autoptr(GSource) timeout_src = NULL;
     g_autoptr(GSource) progress_src = NULL;
     g_autoptr(GSource) stdin_src = NULL;
     struct virshWatchData data = {
         .ctl = ctl,
         .dom = dom,
+        .context = g_main_loop_get_context(eventLoop),
         .timeout_func = timeout_func,
+        .timeout_secs = timeout_secs,
+        .timeout_src = NULL,
         .opaque = opaque,
         .label = label,
         .stdin_ioc = NULL,
@@ -4391,27 +4417,14 @@ virshWatchJob(vshControl *ctl,
         g_source_set_callback(stdin_src,
                               (GSourceFunc)virshWatchInterrupt,
                               &data, NULL);
-        g_source_attach(stdin_src,
-                        g_main_loop_get_context(eventLoop));
-    }
-
-    if (timeout_secs) {
-        vshDebug(ctl, VSH_ERR_DEBUG,
-                 "watchJob: setting timeout of %d secs\n", timeout_secs);
-        timeout_src = g_timeout_source_new_seconds(timeout_secs);
-        g_source_set_callback(timeout_src,
-                              virshWatchTimeout,
-                              &data, NULL);
-        g_source_attach(timeout_src,
-                        g_main_loop_get_context(eventLoop));
+        g_source_attach(stdin_src, data.context);
     }
 
     progress_src = g_timeout_source_new(500);
     g_source_set_callback(progress_src,
                           virshWatchProgress,
                           &data, NULL);
-    g_source_attach(progress_src,
-                    g_main_loop_get_context(eventLoop));
+    g_source_attach(progress_src, data.context);
 
     g_main_loop_run(eventLoop);
 
@@ -4420,8 +4433,10 @@ virshWatchJob(vshControl *ctl,
     if (*job_err == 0 && verbose) /* print [100 %] */
         virshPrintJobProgress(label, 0, 1);
 
-    if (timeout_src)
-        g_source_destroy(timeout_src);
+    if (data.timeout_src) {
+        g_source_destroy(data.timeout_src);
+        g_source_unref(data.timeout_src);
+    }
     g_source_destroy(progress_src);
     if (stdin_src)
         g_source_destroy(stdin_src);
-- 
2.47.0
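
For anyone who wants to follow the new control flow without GLib or libvirt at hand, the decision made on each progress poll can be sketched in plain C roughly as below. This is only an illustration of the branch structure in virshWatchProgress after the patch, not code from the tree: `struct job_sample`, `struct watch_state`, `enum poll_result` and `watch_poll()` are hypothetical names, and `timer_armed` merely stands in for `data->timeout_src` being non-NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the job info fields the patch looks at. */
struct job_sample {
    bool job_active;               /* job type is bounded or unbounded */
    unsigned long long data_total; /* mirrors jobinfo.dataTotal */
};

/* Hypothetical stand-in for the relevant parts of struct virshWatchData. */
struct watch_state {
    int timeout_secs;  /* 0 means no --timeout was requested */
    bool job_started;
    bool timer_armed;  /* stands in for data->timeout_src != NULL */
    bool verbose;
};

enum poll_result { POLL_CONTINUE, POLL_REMOVE };

/*
 * One progress poll. The timeout is armed only once the job reports a
 * non-zero data total, i.e. once storage or memory transfer has begun.
 */
static enum poll_result
watch_poll(struct watch_state *w, const struct job_sample *s)
{
    assert(w != NULL && s != NULL);

    if (!w->job_started && s->job_active)
        w->job_started = true;

    if (w->job_started) {
        if (w->timeout_secs > 0 && !w->timer_armed) {
            /* Keep polling until data starts to flow, then arm the timer
             * (the patch calls virshWatchSetTimeout() at this point). */
            if (s->data_total > 0)
                w->timer_armed = true;
        } else if (!w->verbose) {
            /* Nothing left to wait for and no progress to print. */
            return POLL_REMOVE;
        }
    }

    return POLL_CONTINUE;
}
```

A caller would invoke watch_poll() once per tick (the patch uses a 500 ms progress source) and stop polling when it returns POLL_REMOVE. The point of the sketch is that a job which is active but still reports a zero data total keeps the poll alive without starting the clock.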