From: Jonathon Jongsma
To:
 libvir-list@redhat.com
Subject: [libvirt PATCH v3 18/18] qemu: Monitor nbdkit process for exit
Date: Thu, 20 Oct 2022 16:59:09 -0500
Message-Id: <20221020215909.1751428-19-jjongsma@redhat.com>
In-Reply-To: <20221020215909.1751428-1-jjongsma@redhat.com>
References: <20221020215909.1751428-1-jjongsma@redhat.com>
List-Id: Development discussions about the libvirt library & tools

Add the ability to monitor the nbdkit process so that we can take
action if the child exits unexpectedly. When the nbdkit process exits,
we pause the VM, restart nbdkit, and then resume the VM. This allows
the VM to continue working in the event of an nbdkit failure.

Eventually we may want to generalize this functionality, since we may
need something similar for e.g. qemu-storage-daemon.

The process is monitored with the pidfd_open() syscall when the build
environment declares SYS_pidfd_open (the syscall is available since
Linux 5.3). Otherwise we fall back to checking whether the process is
alive once a second. The one-second interval was chosen somewhat
arbitrarily.

Signed-off-by: Jonathon Jongsma
---
 meson.build             |   3 +
 src/qemu/qemu_nbdkit.c  | 220 ++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_nbdkit.h  |  10 ++
 src/qemu/qemu_process.c |  13 +++
 4 files changed, 246 insertions(+)

diff --git a/meson.build b/meson.build
index e4581e74dd..b4ed170ca1 100644
--- a/meson.build
+++ b/meson.build
@@ -686,6 +686,9 @@ if host_machine.system() == 'linux'
     # Check if we have new enough kernel to support BPF devices for cgroups v2
     [ 'linux/bpf.h', 'BPF_PROG_QUERY' ],
     [ 'linux/bpf.h', 'BPF_CGROUP_DEVICE' ],
+
+    # process management
+    [ 'sys/syscall.h', 'SYS_pidfd_open' ],
   ]
 endif
 
diff --git a/src/qemu/qemu_nbdkit.c b/src/qemu/qemu_nbdkit.c
index 0a0dc5d2a4..f17fe022ec 100644
--- a/src/qemu/qemu_nbdkit.c
+++ b/src/qemu/qemu_nbdkit.c
@@ -21,9 +21,11 @@
 
 #include
 #include
+#include
 
 #include "vircommand.h"
 #include "virerror.h"
+#include "virevent.h"
 #include "virlog.h"
 #include "virpidfile.h"
 #include "virtime.h"
@@ -36,6 +38,7 @@
 #include "qemu_nbdkit.h"
 #define LIBVIRT_QEMU_NBDKITPRIV_H_ALLOW
 #include "qemu_nbdkitpriv.h"
+#include "qemu_process.h"
 #include "qemu_security.h"
 
 #include
@@ -72,6 +75,13 @@ struct _qemuNbdkitCaps {
 G_DEFINE_TYPE(qemuNbdkitCaps, qemu_nbdkit_caps, G_TYPE_OBJECT);
 
 
+struct _qemuNbdkitProcessPrivate {
+    int monitor;
+    virQEMUDriver *driver;
+    virDomainObj *vm;
+};
+
+
 enum {
     PIPE_FD_READ = 0,
     PIPE_FD_WRITE = 1
@@ -588,6 +598,168 @@ qemuNbdkitCapsCacheNew(const char *cachedir)
 }
 
 
+static int
+qemuNbdkitProcessStartMonitor(qemuNbdkitProcess *proc,
+                              virDomainObj *vm,
+                              virQEMUDriver *driver);
+
+
+static void
+qemuNbdkitProcessHandleExit(qemuNbdkitProcess *proc)
+{
+    qemuNbdkitProcessPrivate *priv = proc->priv;
+    bool was_running = false;
+
+    VIR_DEBUG("nbdkit process %i died", proc->pid);
+
+    /* clean up resources associated with process */
+    qemuNbdkitProcessStop(proc);
+
+    if (!(priv->vm && priv->driver)) {
+        VIR_WARN("Unable to restart nbdkit -- vm and driver not set");
+        return;
+    }
+
+    VIR_DEBUG("restarting nbdkit process");
+
+    virObjectLock(priv->vm);
+    if (virDomainObjBeginJob(priv->vm, VIR_JOB_SUSPEND) < 0) {
+        VIR_WARN("can't begin job");
+        goto cleanup;
+    }
+
+    /* Pause domain */
+    if (virDomainObjGetState(priv->vm, NULL) == VIR_DOMAIN_RUNNING) {
+        was_running = true;
+        if (qemuProcessStopCPUs(priv->driver, priv->vm,
+                                VIR_DOMAIN_PAUSED_IOERROR,
+                                VIR_ASYNC_JOB_NONE) < 0)
+            goto endjob;
+        VIR_DEBUG("Paused vm while we restart nbdkit backend");
+    }
+
+    if (qemuNbdkitProcessStart(proc, priv->vm, priv->driver) < 0)
+        VIR_WARN("Unable to restart nbdkit process");
+
+    if (was_running && virDomainObjIsActive(priv->vm)) {
+        if (qemuProcessStartCPUs(priv->driver, priv->vm,
+                                 VIR_DOMAIN_RUNNING_UNPAUSED,
+                                 VIR_ASYNC_JOB_NONE) < 0) {
+            VIR_WARN("Unable to resume guest CPUs after nbdkit restart");
+            goto endjob;
+        }
+        VIR_DEBUG("Resumed vm");
+    }
+    qemuNbdkitProcessStartMonitor(proc, NULL, NULL);
+
+ endjob:
+    virDomainObjEndJob(priv->vm);
+ cleanup:
+    virObjectUnlock(priv->vm);
+}
+
+
+#if WITH_DECL_SYS_PIDFD_OPEN
+static void
+qemuNbdkitProcessPidfdCb(int watch G_GNUC_UNUSED,
+                         int fd,
+                         int events G_GNUC_UNUSED,
+                         void *opaque)
+{
+    qemuNbdkitProcess *proc = opaque;
+
+    VIR_FORCE_CLOSE(fd);
+    qemuNbdkitProcessHandleExit(proc);
+}
+#else
+static void
+qemuNbdkitProcessTimeoutCb(int timer G_GNUC_UNUSED,
+                           void *opaque)
+{
+    qemuNbdkitProcess *proc = opaque;
+
+    if (virProcessKill(proc->pid, 0) < 0)
+        qemuNbdkitProcessHandleExit(proc);
+}
+#endif /* WITH_DECL_SYS_PIDFD_OPEN */
+
+
+static int
+qemuNbdkitProcessStartMonitor(qemuNbdkitProcess *proc,
+                              virDomainObj *vm,
+                              virQEMUDriver *driver)
+{
+    qemuNbdkitProcessPrivate *priv = proc->priv;
+#if WITH_DECL_SYS_PIDFD_OPEN
+    int pidfd;
+#endif
+
+    if (vm) {
+        virObjectRef(vm);
+
+        if (priv->vm)
+            virObjectUnref(priv->vm);
+
+        priv->vm = vm;
+    }
+
+    if (driver)
+        priv->driver = driver;
+
+    if (!(priv->vm && priv->driver)) {
+        VIR_WARN("set vm and driver before calling %s", G_STRFUNC);
+        return -1;
+    }
+
+#if WITH_DECL_SYS_PIDFD_OPEN
+    pidfd = syscall(SYS_pidfd_open, proc->pid, 0);
+    if (pidfd < 0)
+        return -1;
+
+    priv->monitor = virEventAddHandle(pidfd,
+                                      VIR_EVENT_HANDLE_READABLE,
+                                      qemuNbdkitProcessPidfdCb,
+                                      proc, NULL);
+#else
+    /* fall back to checking once a second */
+    priv->monitor = virEventAddTimeout(1000,
+                                       qemuNbdkitProcessTimeoutCb,
+                                       proc, NULL);
+#endif /* WITH_DECL_SYS_PIDFD_OPEN */
+
+    if (priv->monitor < 0)
+        return -1;
+
+    VIR_DEBUG("Monitoring nbdkit process %i for exit", proc->pid);
+
+    return 0;
+}
+
+
+static void
+qemuNbdkitProcessStopMonitor(qemuNbdkitProcess *proc)
+{
+    qemuNbdkitProcessPrivate *priv = proc->priv;
+
+    if (priv->monitor > 0) {
+#if WITH_DECL_SYS_PIDFD_OPEN
+        virEventRemoveHandle(priv->monitor);
+#else
+        virEventRemoveTimeout(priv->monitor);
+#endif /* WITH_DECL_SYS_PIDFD_OPEN */
+        priv->monitor = 0;
+    }
+}
+
+
+static void
+qemuNbdkitProcessPrivateFree(qemuNbdkitProcessPrivate *priv)
+{
+    virObjectUnref(priv->vm);
+    g_free(priv);
+}
+
+
 static qemuNbdkitProcess *
 qemuNbdkitProcessNew(virStorageSource *source,
                      const char *pidfile,
@@ -601,6 +773,7 @@ qemuNbdkitProcessNew(virStorageSource *source,
     nbdkit->pid = -1;
     nbdkit->pidfile = g_strdup(pidfile);
     nbdkit->socketfile = g_strdup(socketfile);
+    nbdkit->priv = g_new0(qemuNbdkitProcessPrivate, 1);
 
     return nbdkit;
 }
@@ -627,6 +800,45 @@ qemuNbdkitProcessLoad(virStorageSource *source,
 }
 
 
+static int
+qemuNbdkitStorageSourceManageProcessOne(virStorageSource *src,
+                                        virDomainObj *vm,
+                                        virQEMUDriver *driver)
+{
+    qemuDomainStorageSourcePrivate *srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(src);
+    qemuNbdkitProcess *nbdkit;
+
+    if (!srcPriv)
+        return 0;
+
+    nbdkit = srcPriv->nbdkitProcess;
+    if (nbdkit) {
+        nbdkit->caps = qemuGetNbdkitCaps(driver);
+
+        if (qemuNbdkitProcessStartMonitor(nbdkit, vm, driver) < 0)
+            return -1;
+    }
+
+    return 0;
+}
+
+
+int
+qemuNbdkitStorageSourceManageProcess(virQEMUDriver *driver,
+                                     virDomainObj *vm,
+                                     virStorageSource *src)
+{
+    virStorageSource *backing;
+
+    for (backing = src->backingStore; backing != NULL; backing = backing->backingStore) {
+        if (qemuNbdkitStorageSourceManageProcessOne(backing, vm, driver) < 0)
+            return -1;
+    }
+
+    return qemuNbdkitStorageSourceManageProcessOne(src, vm, driver);
+}
+
+
 bool
 qemuNbdkitInitStorageSource(qemuNbdkitCaps *caps,
                             virStorageSource *source,
@@ -915,9 +1127,12 @@ qemuNbdkitProcessBuildCommand(qemuNbdkitProcess *proc)
 void
 qemuNbdkitProcessFree(qemuNbdkitProcess *proc)
 {
+    qemuNbdkitProcessStopMonitor(proc);
+
     g_clear_pointer(&proc->pidfile, g_free);
     g_clear_pointer(&proc->socketfile, g_free);
     g_clear_object(&proc->caps);
+    g_clear_pointer(&proc->priv, qemuNbdkitProcessPrivateFree);
     g_free(proc);
 }
 
@@ -988,6 +1203,9 @@ qemuNbdkitProcessStart(qemuNbdkitProcess *proc,
         goto error;
     }
 
+    if (qemuNbdkitProcessStartMonitor(proc, vm, driver) < 0)
+        goto error;
+
     return 0;
 
  error:
@@ -1007,6 +1225,8 @@ qemuNbdkitProcessStop(qemuNbdkitProcess *proc)
 {
     int ret;
 
+    qemuNbdkitProcessStopMonitor(proc);
+
     if (proc->pid < 0)
         return 0;
 
diff --git a/src/qemu/qemu_nbdkit.h b/src/qemu/qemu_nbdkit.h
index c9af6efcfa..da53138d13 100644
--- a/src/qemu/qemu_nbdkit.h
+++ b/src/qemu/qemu_nbdkit.h
@@ -65,6 +65,11 @@ qemuNbdkitStartStorageSource(virQEMUDriver *driver,
 void
 qemuNbdkitStopStorageSource(virStorageSource *src);
 
+int
+qemuNbdkitStorageSourceManageProcess(virQEMUDriver *driver,
+                                     virDomainObj *vm,
+                                     virStorageSource *src);
+
 bool
 qemuNbdkitCapsGet(qemuNbdkitCaps *nbdkitCaps,
                   qemuNbdkitCapsFlags flag);
@@ -76,6 +81,8 @@ qemuNbdkitCapsSet(qemuNbdkitCaps *nbdkitCaps,
 #define QEMU_TYPE_NBDKIT_CAPS qemu_nbdkit_caps_get_type()
 G_DECLARE_FINAL_TYPE(qemuNbdkitCaps, qemu_nbdkit_caps, QEMU, NBDKIT_CAPS, GObject);
 
+typedef struct _qemuNbdkitProcessPrivate qemuNbdkitProcessPrivate;
+
 struct _qemuNbdkitProcess {
     qemuNbdkitCaps *caps;
     virStorageSource *source;
@@ -85,6 +92,8 @@ struct _qemuNbdkitProcess {
     uid_t user;
     gid_t group;
     pid_t pid;
+
+    qemuNbdkitProcessPrivate *priv;
 };
 
 int
@@ -107,4 +116,5 @@ qemuNbdkitProcessLoad(virStorageSource *source,
                       const char *pidfile,
                       const char *socketfile);
 
+
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(qemuNbdkitProcess, qemuNbdkitProcessFree);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index f405326312..43e828d42f 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -9009,6 +9009,19 @@ qemuProcessReconnect(void *opaque)
         }
     }
 
+    for (i = 0; i < obj->def->ndisks; i++) {
+        virDomainDiskDef *disk = obj->def->disks[i];
+        if (qemuNbdkitStorageSourceManageProcess(driver, obj, disk->src) < 0)
+            goto error;
+    }
+
+    if (obj->def->os.loader && obj->def->os.loader->nvram) {
+        if (qemuNbdkitStorageSourceManageProcess(driver, obj,
+                                                 obj->def->os.loader->nvram) < 0)
+            goto error;
+    }
+
+
     /* update domain state XML with possibly updated state in virDomainObj */
     if (virDomainObjSave(obj, driver->xmlopt, cfg->stateDir) < 0)
         goto error;
-- 
2.37.3
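
The two monitoring strategies named in the commit message (a pidfd where
available, otherwise a periodic liveness check) can be illustrated with a
minimal standalone sketch. This is not libvirt code: it assumes Linux with
glibc, and wait_for_exit() and spawn_and_wait_demo() are hypothetical
helpers invented for this example. It blocks in poll() instead of
registering with an event loop as the patch does. Unlike the patch, which
chooses at build time, it falls back to polling at run time when
pidfd_open() fails.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: wait until `pid` exits.
 * Returns 0 once the process is gone, 1 on timeout, -1 on error.
 * A negative timeout_ms waits indefinitely. */
int
wait_for_exit(pid_t pid, int interval_ms, int timeout_ms)
{
    int waited_ms = 0;

    assert(pid > 0 && interval_ms > 0);

#ifdef SYS_pidfd_open
    {
        /* Preferred: a pidfd becomes readable when the process terminates. */
        int pidfd = (int)syscall(SYS_pidfd_open, pid, 0);

        if (pidfd >= 0) {
            struct pollfd pfd = { .fd = pidfd, .events = POLLIN };
            int rc;

            do {
                rc = poll(&pfd, 1, timeout_ms);
            } while (rc < 0 && errno == EINTR);

            close(pidfd);
            if (rc < 0)
                return -1;
            return rc == 0 ? 1 : 0;
        }
        if (errno == ESRCH)
            return 0; /* process already gone */
        /* ENOSYS, EPERM, ...: fall through to polling */
    }
#endif

    /* Fallback: signal 0 delivers nothing, it only probes for existence.
     * EPERM still means the process exists. */
    while (kill(pid, 0) == 0 || errno == EPERM) {
        struct timespec ts = {
            .tv_sec = interval_ms / 1000,
            .tv_nsec = (long)(interval_ms % 1000) * 1000000L,
        };

        if (timeout_ms >= 0 && waited_ms >= timeout_ms)
            return 1;
        nanosleep(&ts, NULL);
        waited_ms += interval_ms;
    }
    return 0;
}

/* Demo: spawn a short-lived child and wait for it to exit.
 * SIGCHLD is ignored so the kernel reaps the child itself; an unreaped
 * zombie would keep answering kill(pid, 0) in the fallback path. */
int
spawn_and_wait_demo(void)
{
    pid_t pid;

    signal(SIGCHLD, SIG_IGN);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        usleep(50 * 1000);
        _exit(0);
    }
    return wait_for_exit(pid, 100, 5000);
}
```

Calling spawn_and_wait_demo() from main() is expected to return 0 on
either path. The zombie caveat applies to any monitor built on
kill(pid, 0): the check only reports an exit once the process has been
reaped.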