From: Peter Krempa
To: libvir-list@redhat.com
Subject: [PATCH 32/32] kbase: Add document outlining internals of incremental backup in qemu
Date: Mon, 15 Jun 2020 19:10:19 +0200
Message-Id: <4557db5a4445286a22c6f72ec8a9732968c5dc72.1592240635.git.pkrempa@redhat.com>
List-Id: Development discussions about the libvirt library & tools
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Outline the basics and how to integrate with externally created overlays.
Other topics will continue later.

Signed-off-by: Peter Krempa
---
 docs/kbase.html.in                        |   3 +
 docs/kbase/incrementalbackupinternals.rst | 210 ++++++++++++++++++++++
 2 files changed, 213 insertions(+)
 create mode 100644 docs/kbase/incrementalbackupinternals.rst

diff --git a/docs/kbase.html.in b/docs/kbase.html.in
index c586e0f676..4257e52b7e 100644
--- a/docs/kbase.html.in
+++ b/docs/kbase.html.in
@@ -36,6 +36,9 @@
Virtio-FS
Share a filesystem between the guest and the host
+ +
Incremental backup internals
+
Incremental backup implementation details relevant for users
diff --git a/docs/kbase/incrementalbackupinternals.rst b/docs/kbase/incrementalbackupinternals.rst
new file mode 100644
index 0000000000..adf12002d2
--- /dev/null
+++ b/docs/kbase/incrementalbackupinternals.rst
@@ -0,0 +1,210 @@
+================================================
+Internals of incremental backup handling in qemu
+================================================
+
+.. contents::
+
+Libvirt's implementation of incremental backups in the ``qemu`` driver uses
+qemu's ``block-dirty-bitmaps`` under the hood to track the guest visible disk
+state changes corresponding to the points in time described by a libvirt
+checkpoint.
+
+There are some semantic implications of how libvirt creates and manages the
+bitmaps which de-facto become API as they are written into the disk images and
+this document will try to summarize them.
+
+Glossary
+========
+
+Checkpoint
+
+    A libvirt object which represents a named point in time of the life of the
+    VM where libvirt tracks writes the VM has done and then allows a backup of
+    blocks which changed. Note that VM memory state is *not* captured.
+
+    A checkpoint can be created either explicitly via the corresponding API
+    (which isn't very useful) or as part of creating an incremental or full
+    backup of the VM using the ``virDomainBackupBegin`` API, which allows the
+    next backup to only copy the differences.
+
+Backup
+
+    A copy of either all blocks of selected disks (full backup) or blocks changed
+    since a checkpoint (incremental backup) at the time the backup job was
+    started. (Blocks modified while the backup job is running are not part of the
+    backup!)
+
+Snapshot
+
+    Similar to a checkpoint, a snapshot is a point in time in the lifecycle of
+    the VM, but the state of the VM including memory is captured at that point,
+    allowing a return to that state later.
+
+Blockjob
+
+    A long running job which modifies the shape and/or location of the disk
+    backing chain (images storing the disk contents). Libvirt supports
+    ``block pull`` where data is moved up the chain towards the active layer,
+    ``block commit`` where data is moved down the chain towards the base/oldest
+    image. These blockjobs always remove images from the backing chain. Lastly,
+    ``block copy`` moves the image to a different location (possibly collapsing
+    the chain so that all of the data ends up in the one new image).
+
+block-dirty-bitmap (bitmap)
+
+    A data structure in qemu tracking which blocks were written by the guest
+    OS since the bitmap was created.
+
+Relationships of bitmaps, checkpoints and VM disks
+==================================================
+
+When a checkpoint is created libvirt creates a block-dirty-bitmap for every
+configured VM disk named the same way as the checkpoint. The bitmap is actively
+recording which blocks were changed by the guest OS from that point on. Other
+bitmaps are not impacted in any way as they are self-contained:
+
+::
+
+  +----------------+       +----------------+
+  | disk: vda      |       | disk: vdb      |
+  +--------+-------+       +--------+-------+
+           |                        |
+  +--------v-------+       +--------v-------+
+  | vda-1.qcow2    |       | vdb-1.qcow2    |
+  |                |       |                |
+  | bitmaps: chk-a |       | bitmaps: chk-a |
+  |          chk-b |       |          chk-b |
+  |                |       |                |
+  +----------------+       +----------------+
+
+Bitmaps are created at the same time to track changes to all disks in sync and
+are active and persisted in the QCOW2 image. Other formats currently don't
+support this feature.
+
+Modification of bitmaps outside of libvirt is not recommended. It should be
+safe when adhering to the semantics described in this document, but obviously
+we can't guarantee that.
+
+
+Integration with external snapshots
+===================================
+
+Handling of bitmaps
+-------------------
+
+Creating an external snapshot involves adding a new layer to the backing chain
+on top of the previous chain. In this step there are no new bitmaps created by
+default, which would mean that backups become impossible after this step.
+
+To prevent this from happening we need to re-create the active bitmaps in the
+new top/active layer of the backing chain, which allows us to continue tracking
+changes with the same granularity as before and also allows libvirt to stitch
+together all the corresponding bitmaps to do a backup across snapshots.
+
+After taking a snapshot of the ``vda`` disk from the example above placed into
+``vda-2.qcow2`` the following topology will be created:
+
+::
+
+  +----------------+
+  | disk: vda      |
+  +-------+--------+
+          |
+  +-------v--------+    +----------------+
+  | vda-2.qcow2    |    | vda-1.qcow2    |
+  |                |    |                |
+  | bitmaps: chk-a +----> bitmaps: chk-a |
+  |          chk-b |    |          chk-b |
+  |                |    |                |
+  +----------------+    +----------------+
+
+Checking bitmap health
+----------------------
+
+QEMU optimizes disk writes by only updating the bitmaps in certain cases. This
+can also cause problems, e.g. when QEMU crashes.
+
+For a chain of corresponding bitmaps in a backing chain to be considered valid
+and eligible for use with ``virDomainBackupBegin`` it must conform to the
+following rules:
+
+1) The top image must contain the bitmap
+2) If any of the backing images in the chain contain the bitmap too, all
+   contiguous images must have the bitmap (no gaps)
+3) All of the above bitmaps must be marked as active
+   (``auto`` flag in ``qemu-img`` output, ``recording`` in qemu)
+4) None of the above bitmaps can be inconsistent
+   (``in-use`` flag in ``qemu-img`` output, provided the image is not
+   currently in use by a qemu instance, or ``inconsistent`` in qemu)
+
+::
+
+  # check that image has bitmaps
+  $ qemu-img info vda-1.qcow2
+  image: vda-1.qcow2
+  file format: qcow2
+  virtual size: 100 MiB (104857600 bytes)
+  disk size: 220 KiB
+  cluster_size: 65536
+  Format specific information:
+      compat: 1.1
+      compression type: zlib
+      lazy refcounts: false
+      bitmaps:
+          [0]:
+              flags:
+                  [0]: in-use
+                  [1]: auto
+              name: chk-a
+              granularity: 65536
+          [1]:
+              flags:
+                  [0]: auto
+              name: chk-b
+              granularity: 65536
+      refcount bits: 16
+      corrupt: false
+
+(See also the ``qemuBlockBitmapChainIsValid`` helper method in
+``src/qemu/qemu_block.c``)
+
+Creating external checkpoints manually
+--------------------------------------
+
+To create the same topology outside of libvirt (e.g. for offline snapshots)
+a ``qemu-img`` new enough to support the ``bitmap`` subcommand is necessary.
+The following bash script then ensures that the new image after the snapshot
+will work with backups (note that ``jq`` is a JSON processor):
+
+::
+
+  # arguments
+  SNAP_IMG="vda-2.qcow2"
+  BACKING_IMG="vda-1.qcow2"
+
+  # constants - snapshots and bitmaps work only with qcow2
+  SNAP_FMT="qcow2"
+  BACKING_IMG_FMT="qcow2"
+
+  # create snapshot overlay
+  qemu-img create -f "$SNAP_FMT" -F "$BACKING_IMG_FMT" -b "$BACKING_IMG" "$SNAP_IMG"
+
+  BACKING_IMG_INFO=$(qemu-img info --output=json -f "$BACKING_IMG_FMT" "$BACKING_IMG")
+  BACKING_BITMAPS=$(jq '."format-specific".data.bitmaps' <<< "$BACKING_IMG_INFO")
+
+  if [ "x$BACKING_BITMAPS" = "xnull" ]; then
+      exit 0  # backing image has no bitmaps, nothing to re-create
+  fi
+
+  for BACKING_BITMAP_ in $(jq -c '.[]' <<< "$BACKING_BITMAPS"); do
+      BITMAP_FLAGS=$(jq -c -r '.flags[]' <<< "$BACKING_BITMAP_")
+      BITMAP_NAME=$(jq -r '.name' <<< "$BACKING_BITMAP_")
+
+      if grep -q 'in-use' <<< "$BITMAP_FLAGS" ||
+         ! grep -q 'auto' <<< "$BITMAP_FLAGS"; then
+         continue  # skip inconsistent or inactive bitmaps
+      fi
+
+      qemu-img bitmap -f "$SNAP_FMT" "$SNAP_IMG" --add "$BITMAP_NAME"
+
+  done
-- 
2.26.2