From: Eric Blake
To: libvir-list@redhat.com
Cc: nsoffer@redhat.com, eshenitz@redhat.com, pkrempa@redhat.com
Date: Sat, 6 Jul 2019 22:56:03 -0500
Message-Id: <20190707035613.25754-4-eblake@redhat.com>
In-Reply-To: <20190707035613.25754-1-eblake@redhat.com>
References: <20190707035613.25754-1-eblake@redhat.com>
Subject: [libvirt] [PATCH v9 03/13] backup: Document nuances between different state capture APIs

Now that various new APIs have been added or are coming soon, it is
worth adding a landing page that gives an overview of capturing various
pieces of guest state, and which APIs are best suited to which tasks.

Signed-off-by: Eric Blake
Reviewed-by: John Ferlan
Reviewed-by: Daniel P. Berrangé
---
 docs/docs.html.in               |   5 +
 docs/domainstatecapture.html.in | 314 ++++++++++++++++++++++++++++++++
 docs/formatcheckpoint.html.in   |   4 +-
 docs/formatsnapshot.html.in     |   2 +
 4 files changed, 324 insertions(+), 1 deletion(-)
 create mode 100644 docs/domainstatecapture.html.in

diff --git a/docs/docs.html.in b/docs/docs.html.in
index 2d44e3ab2e..a00b131466 100644
--- a/docs/docs.html.in
+++ b/docs/docs.html.in
@@ -124,6 +124,11 @@
Secure usage
Secure usage of the libvirt APIs
+ +
+        Domain state capture
+        Comparison between different methods of capturing domain state
diff --git a/docs/domainstatecapture.html.in b/docs/domainstatecapture.html.in
new file mode 100644
index 0000000000..4f109180da
--- /dev/null
+++ b/docs/domainstatecapture.html.in
@@ -0,0 +1,314 @@
+
+
+
+
+

Domain state capture using Libvirt

+ +
    + +

+      To help application developers choose which operations best
+      suit their needs, this page compares the different means of
+      capturing state related to a domain managed by libvirt.

    + +

+      The information here is primarily geared towards capturing the
+      state of an active domain. Capturing the state of an inactive
+      domain essentially amounts to copying the contents of guest
+      disks, followed by a fresh boot with disks restored to that
+      state.

    + +

    State capture trade-offs

    + +

+      One of the features made possible with virtual machines is live
+      migration -- transferring all state related to the guest from
+      one host to another with minimal interruption to the guest's
+      activity. In this case, state includes domain memory (including
+      register and device contents), and domain storage (whether the
+      guest's view of the disks is backed by local storage on the
+      host, or by the hypervisor accessing shared storage over a
+      network). A clever observer will then note that if all state is
+      available for live migration, then there is nothing stopping a
+      user from saving some or all of that state at a given point in
+      time in order to be able to later rewind guest execution back to
+      the state it previously had. The astute reader will also realize
+      that state capture at any level requires that the data be
+      stored and managed by some mechanism. The captured data might
+      fit in a single file, or more likely require a chain of related
+      files, and may require synchronization with third-party tools
+      built around managing the amount of data resulting from
+      capturing the state of multiple guests that each use multiple
+      disks.

    + +

+      There are several libvirt APIs associated with capturing the
+      state of a guest, which can later be used to rewind that guest
+      to the conditions it was in earlier. The following is a list of
+      trade-offs and differences between the various facets that
+      affect capturing domain state for active domains:

    + +
    +
    Duration
    +
+        Capturing state can be a lengthy process, so while the
+        captured state ideally represents an atomic point in time
+        corresponding to something the guest was actually executing,
+        capturing state tends to focus on minimizing guest downtime
+        while performing the rest of the state capture in parallel
+        with guest execution. Some interfaces require up-front
+        preparation (the state captured is not complete until the API
+        ends, which may be some time after the command was first
+        started), while other interfaces track the state when the
+        command was first issued, regardless of the time spent in
+        capturing the rest of the state. Also, time spent in state
+        capture may be longer than the time required for live
+        migration, when state must be duplicated rather than shared.
    + +
    Amount of state
    +
+        For an online guest, there is a choice between capturing the
+        guest's memory (all that is needed during live migration when
+        the storage is already shared between source and destination),
+        the guest's disk state (all that is needed if there are no
+        pending guest I/O transactions that would be lost without the
+        corresponding memory state), or both together. Reverting to
+        partial state may still be viable, but typically, booting from
+        captured disk state without corresponding memory is comparable
+        to rebooting a machine that had power cut before I/O could be
+        flushed. Guests may need to use proper journaling methods to
+        avoid problems when booting from partial state.
    + +
    Quiescing of data
    +
+        Even if a guest has no pending I/O, capturing disk state may
+        catch the guest at a time when the contents of the disk are
+        inconsistent. Cooperating with the guest to perform data
+        quiescing is an optional step to ensure that captured disk
+        state is fully consistent without requiring additional memory
+        state, rather than just crash-consistent. But guest
+        cooperation may also have time constraints, where the guest
+        can rightfully panic if there is too much downtime while I/O
+        is frozen.
    + +
    Quantity of files
    +
+        When capturing state, some approaches store all state within
+        the same file (internal), while others expand a chain of
+        related files that must be used together (external),
+        resulting in more files that a management application must
+        track.
    + +
    Impact to guest definition
    +
+        Capturing state may require temporary changes to the guest
+        definition, such as associating new files into the domain
+        definition. While state capture should never impact the
+        running guest, a change to the domain's active XML may have
+        an impact on other host operations being performed on the
+        domain.
    + +
    Third-party integration
    +
+        When capturing state, there are trade-offs in how much of the
+        process must be done directly by the hypervisor, and how much
+        can be off-loaded to third-party software. Since capturing
+        state is not instantaneous, it is essential that any
+        third-party integration see consistent data even if the
+        running guest continues to modify that data after the point in
+        time of the capture.
    + +
    Full vs. incremental
    +
+        When periodically repeating the action of state capture, it
+        is useful to minimize the amount of state that must be
+        captured by exploiting the relation to a previous capture,
+        such as focusing only on the portions of the disk that the
+        guest has modified in the meantime. Some approaches are able
+        to take advantage of checkpoints to provide an incremental
+        backup, while others are only capable of a full backup even if
+        that means re-capturing unchanged portions of the disk.
    + +
    Local vs. remote
    +
+        Domains that completely use remote storage may only need
+        some mechanism to keep track of guest memory state while using
+        external means to manage storage. Still, hypervisor and guest
+        cooperation to ensure points in time when no I/O is in flight
+        across the network can be important for properly capturing
+        disk state.
    + +
    Network latency
    +
+        Whether the network carries domain storage or saved domain
+        state headed for remote storage, network latency has an
+        impact on snapshot data. Having dedicated network capacity,
+        bandwidth, or quality of service levels may play a role, as
+        well as planning for how much of the backup process needs to
+        be local.
    +
    + +
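The "Full vs. incremental" facet above can be illustrated with a toy
model. This is only a sketch under stated assumptions: TrackedDisk and
restore() are hypothetical names invented for illustration, and a Python
set stands in for the dirty block bitmap that a checkpoint would track.
It is not how libvirt or QEMU store this data.

```python
class TrackedDisk:
    """Toy disk that records which blocks changed since the last backup."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.dirty = set()  # stands in for a dirty block bitmap

    def write(self, index, data):
        """Guest write: update the block and mark it dirty."""
        self.blocks[index] = data
        self.dirty.add(index)

    def full_backup(self):
        """Copy every block and start a fresh tracking period."""
        self.dirty.clear()
        return dict(enumerate(self.blocks))

    def incremental_backup(self):
        """Copy only the blocks written since the previous backup."""
        changed = {i: self.blocks[i] for i in sorted(self.dirty)}
        self.dirty.clear()
        return changed


def restore(full, incrementals):
    """Rebuild disk contents by replaying incrementals over a full backup."""
    image = dict(full)
    for delta in incrementals:
        image.update(delta)
    return [image[i] for i in range(len(image))]
```

A full backup followed by any number of incremental ones can be passed
to restore() in order; an incremental backup taken when nothing was
written is simply empty.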

+      An example of the various facets in action is migration of a
+      running guest. In order for the guest to be able to resume on
+      the destination at the same place it left off at the source, the
+      hypervisor has to get to a point where execution on the source
+      is stopped, the last remaining changes occurring since the
+      migration started are then transferred, and the guest is started
+      on the target. The management software thus must keep track of
+      the starting point and any changes since the starting
+      point. These last changes are often referred to as dirty page
+      tracking or dirty disk block bitmaps. At some point in time
+      during the migration, the management software must freeze the
+      source guest, transfer the dirty data, and then start the guest
+      on the target. This period of time must be minimal. To minimize
+      overall migration time, one is advised to use a dedicated
+      network connection with a high quality of service.
+      Alternatively, saving the current state of the running guest
+      can just be a point-in-time operation which doesn't require
+      updating the "last vestiges" of state prior to writing out the
+      saved state file. The state file is the point in time of
+      whatever is current and may contain incomplete data which, if
+      used to restart the guest, could cause confusion or problems
+      because some operation wasn't completed, depending upon where
+      in time the operation was commenced.
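The migration flow described above (bulk copy, repeated transfer of
dirty data while the guest runs, then a short pause for the final
delta) can be sketched as a small simulation. Everything here is an
assumption made for illustration: precopy_migrate(), its parameters,
and the max_final_dirty threshold are hypothetical and do not mirror
any hypervisor's actual convergence logic.

```python
def precopy_migrate(source, dirty_rounds, max_final_dirty=2):
    """Copy pages while the guest runs, then pause for the final delta.

    source: list of page contents, mutated in place by the simulated guest.
    dirty_rounds: list of dicts {page_index: new_value}, one per copy
        round, standing in for guest writes made during that round.
    Returns (destination_pages, pages_sent_while_paused).
    """
    dest = list(source)            # round 0: bulk copy of every page
    rounds = iter(dirty_rounds)
    while True:
        writes = next(rounds, {})  # guest activity during the last copy
        for index, value in writes.items():
            source[index] = value
        dirty = set(writes)
        if len(dirty) <= max_final_dirty:
            break                  # small enough to send during downtime
        for index in dirty:        # guest still running: re-copy and loop
            dest[index] = source[index]
    # The guest is paused here, so nothing new is dirtied during this copy.
    for index in dirty:
        dest[index] = source[index]
    return dest, len(dirty)
```

The second return value is the amount of data sent while the guest is
paused, which is the quantity the text says must be kept minimal.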

    + +

    State capture APIs

    +

+      With those definitions, the following libvirt APIs related to
+      state capture have these properties:

    +
    +
    virDomainManagedSave
    +
+        This API saves guest memory, with libvirt managing all of
+        the saved state, then stops the guest. While stopped, the
+        disks can be copied by a third party. However, since any
+        subsequent restart of the guest by libvirt API will restore
+        the memory state (which typically only works if the disk state
+        is unchanged in the meantime), and since it is not possible to
+        get at the memory state that libvirt is managing, this is not
+        viable as a means for rolling back to earlier saved states,
+        but is rather more suited to situations such as suspending a
+        guest prior to rebooting the host in order to resume the guest
+        when the host is back up. This API also has a drawback of
+        potentially long guest downtime, and therefore does not lend
+        itself well to live backups.
    + +
    virDomainSave
    +
+        This API is similar to virDomainManagedSave(), but moves the
+        burden of managing the stored memory state to the user. As
+        such, the user can now couple saved state with copies of the
+        disks to perform a revert to an arbitrary earlier saved state.
+        However, changing who manages the memory state does not change
+        the drawback of potentially long guest downtime when capturing
+        state.
    + +
    virDomainSnapshotCreateXML
    +
+        This API wraps several approaches for capturing guest state,
+        with a general premise of creating a snapshot (where the
+        current guest resources are frozen in time and a new wrapper
+        layer is opened for tracking subsequent guest changes). It
+        can operate on both offline and running guests, can choose
+        whether to capture the state of memory, disk, or both when
+        used on a running guest, and can choose between internal and
+        external storage for captured state. However, it is geared
+        towards post-event captures (when capturing both memory and
+        disk state, the disk state is not captured until all memory
+        state has been collected first). With QEMU as the
+        hypervisor, internal snapshots currently have lengthy downtime
+        that is incompatible with freezing guest I/O, but external
+        snapshots are quick. Since creating an external snapshot
+        changes which disk image resource is in use by the guest, this
+        API can be coupled with virDomainBlockCommit() to
+        restore things back to the guest using its original disk
+        image, where a third-party tool can read the backing file
+        prior to the live commit. See also
+        the XML details used with
+        this command.
    + +
    virDomainFSFreeze, virDomainFSThaw
    +
+        This pair of APIs does not directly capture guest state, but
+        can be used to coordinate with a trusted live guest that state
+        capture is about to happen, and therefore guest I/O should be
+        quiesced so that the state capture is fully consistent, rather
+        than merely crash-consistent. Some APIs are able to
+        automatically perform a freeze and thaw via a flags parameter,
+        rather than having to make separate calls to these
+        functions. Also, note that freezing guest I/O is only possible
+        with trusted guests running a guest agent, and that some
+        guests place maximum time limits on how long I/O can be
+        frozen.
    + +
    virDomainBlockCopy
    +
+        This API wraps approaches for capturing the disk state of a
+        running guest, but does not track accompanying guest memory
+        state, and can only operate on one block device per job. To
+        get a consistent copy of multiple disks, multiple jobs must be
+        run in parallel, then the domain must be paused before ending
+        all of the jobs. The capture is consistent only at the end of
+        the operation, at which point future guest changes can either
+        pivot to the new file or continue to use the original file.
+        The resulting backup file is thus whichever file is no longer
+        in use by the guest.
    + +
    virDomainCheckpointCreateXML
    +
+        This API does not actually capture guest state, rather it
+        makes it possible to track which portions of guest disks have
+        changed between a checkpoint and the current live execution of
+        the guest. However, while it is possible to use this API to
+        create checkpoints in isolation, it is more typical to create
+        a checkpoint as a side-effect of starting a new incremental
+        backup with virDomainBackupBegin() or at the
+        creation of an external snapshot
+        with virDomainSnapshotCreateXML2(), since a
+        second incremental backup is most useful when using the
+        checkpoint created during the first. See also
+        the XML details used with
+        this command.
    + +
    virDomainBackupBegin, virDomainBackupEnd
    +
+        This API wraps approaches for capturing the state of disks
+        of a running guest, but does not track accompanying guest
+        memory state. The capture is consistent to the start of the
+        operation, where the captured state is stored independently
+        from the disk image in use with the guest and where it can be
+        easily integrated with a third party for capturing the disk
+        state. Since the backup operation is stored externally from
+        the guest resources, there is no need to commit data back in
+        at the completion of the operation. When coupled with
+        checkpoints, this can be used to capture incremental backups
+        instead of full.
    +
    + +
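The freeze and thaw pairing described for virDomainFSFreeze and
virDomainFSThaw can be sketched as a context manager that always thaws
the guest, since the text notes that guests limit how long I/O may stay
frozen. This is a minimal sketch, assuming a domain object that exposes
fsFreeze() and fsThaw() methods as the libvirt-python virDomain class
does; quiesced() itself is a hypothetical helper, not part of libvirt.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(dom):
    """Freeze guest filesystems for the duration of the with-block.

    dom is assumed to provide fsFreeze() and fsThaw(), as a
    libvirt-python domain object does.
    """
    dom.fsFreeze()
    try:
        yield dom
    finally:
        dom.fsThaw()  # always thaw, even if the capture step raised
```

The body of the with-block should hold only the step that marks the
point in time of the capture, so that the frozen period stays short.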

    Examples

    +

+      The following two sequences both accomplish the task of
+      capturing the disk state of a running guest, then wrapping
+      things up so that the guest is still running with the same file
+      as its disk image as before the sequence of operations began.
+      The difference between the two sequences boils down to the
+      impact of an unexpected interruption made at any point in the
+      middle of the sequence: with such an interruption, the first
+      example leaves the guest tied to a temporary wrapper file rather
+      than the original disk, and requires manual clean up of the
+      domain definition; while the second example has no impact on the
+      domain definition.

    + +

    1. Backup via temporary snapshot

    +virDomainFSFreeze()
    +virDomainSnapshotCreateXML(VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY)
    +virDomainFSThaw()
    +third-party copy the backing file to backup storage # most time spent here
    +virDomainBlockCommit(VIR_DOMAIN_BLOCK_COMMIT_ACTIVE) per disk
    +wait for commit ready event per disk
    +virDomainBlockJobAbort() per disk
    +      
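The first sequence can be sketched as one function. This is a sketch
under assumptions, not a tested backup tool: dom is assumed to behave
like a libvirt-python domain (fsFreeze, snapshotCreateXML, fsThaw,
blockCommit, blockJobAbort), while copy_backing_file, wait_until_ready
and the flags dictionary are hypothetical inputs supplied by the
caller. With libvirt-python the three flag values would presumably be
VIR_DOMAIN_SNAPSHOT_CREATE_DISK_ONLY, VIR_DOMAIN_BLOCK_COMMIT_ACTIVE
and VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT.

```python
def backup_via_temporary_snapshot(dom, snapshot_xml, disks,
                                  copy_backing_file, wait_until_ready,
                                  flags):
    """Run the temporary-snapshot backup sequence for the given disks.

    copy_backing_file(disk) and wait_until_ready(disk) are hypothetical
    callbacks; flags maps "disk_only", "commit_active" and "abort_pivot"
    to the corresponding libvirt flag values.
    """
    dom.fsFreeze()
    try:
        dom.snapshotCreateXML(snapshot_xml, flags["disk_only"])
    finally:
        dom.fsThaw()  # keep the frozen window as short as possible
    for disk in disks:
        copy_backing_file(disk)  # most time is spent here
    for disk in disks:
        # base=None and top=None request libvirt's defaults; this assumes
        # the chain holds only the temporary overlay above the original.
        dom.blockCommit(disk, None, None, 0, flags["commit_active"])
    for disk in disks:
        wait_until_ready(disk)
    for disk in disks:
        # Ending an active commit with the pivot flag returns the guest
        # to its original image.
        dom.blockJobAbort(disk, flags["abort_pivot"])
```

If the function is interrupted after the snapshot is created and before
the final loop completes, the guest keeps using the temporary overlay,
which is the clean-up burden the text attributes to this approach.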

    + +

    2. Direct backup

    +virDomainFSFreeze()
    +virDomainBackupBegin()
    +virDomainFSThaw()
    +wait for push mode event, or pull data over NBD # most time spent here
    +virDomainBackupEnd()
    +    
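The second sequence can be sketched the same way. Because
virDomainBackupBegin() and virDomainBackupEnd() are introduced by this
patch series, the sketch does not assume any particular binding for
them: begin_backup, transfer and end_backup are hypothetical callables
that the caller would implement on top of those APIs. Only fsFreeze()
and fsThaw() are assumed to exist on dom, as in libvirt-python. Ending
the job when the transfer fails is a design choice of this sketch, not
something the document prescribes.

```python
def direct_backup(dom, begin_backup, transfer, end_backup):
    """Run the direct backup sequence.

    begin_backup(dom) returns a job handle, transfer(job) waits for the
    push-mode event or pulls the data over NBD, and end_backup(dom, job)
    closes the job. All three are hypothetical wrappers.
    """
    dom.fsFreeze()
    try:
        job = begin_backup(dom)
    finally:
        dom.fsThaw()  # thaw as soon as the point in time is fixed
    try:
        transfer(job)  # most time is spent here
    finally:
        end_backup(dom, job)  # close the job even if the transfer failed
    return job
```

No step here changes which disk image the guest uses, which matches the
text's point that an interruption leaves the domain definition alone.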

+
+
+
diff --git a/docs/formatcheckpoint.html.in b/docs/formatcheckpoint.html.in
index ef5f8a826b..030f0d6af0 100644
--- a/docs/formatcheckpoint.html.in
+++ b/docs/formatcheckpoint.html.in
@@ -13,7 +13,9 @@
       incremental backups. Right now, incremental backups are only
       supported for the QEMU hypervisor when using qcow2 disks at the
       active layer; if other disk formats are in use, capturing disk
-      backups requires different libvirt APIs.
+      backups requires different libvirt APIs
+      (see domain state capture
+      for a comparison between APIs).

       Libvirt is able to facilitate incremental backups by tracking
diff --git a/docs/formatsnapshot.html.in b/docs/formatsnapshot.html.in
index 2bfb69cf49..ee9aa7817f 100644
--- a/docs/formatsnapshot.html.in
+++ b/docs/formatsnapshot.html.in
@@ -9,6 +9,8 @@

    Snapshot XML

+      Snapshots are one form
+      of domain state capture.
       There are several types of snapshots:

-- 
2.20.1