From: conte.souleymane@gmail.com
To: qemu-devel@nongnu.org
Cc: eblake@redhat.com, jsnow@redhat.com, peter.maydell@linaro.org, Souleymane Conte
Subject: [PATCH] docs/interop: convert text file to restructuredText format
Date: Sun, 8 Jun 2025 13:48:43 +0000
Message-ID: <20250608134843.26530-1-conte.souleymane@gmail.com>

From: Souleymane Conte

buglink: https://gitlab.com/qemu-project/qemu/-/issues/527

Signed-off-by: Souleymane Conte
---
 docs/interop/index.rst    |   1 +
 docs/interop/qed_spec.rst | 219 ++++++++++++++++++++++++++++++++++++++
 docs/interop/qed_spec.txt | 138 ------------------------
 3 files changed, 220 insertions(+), 138 deletions(-)
 create mode 100644 docs/interop/qed_spec.rst
 delete mode 100644 docs/interop/qed_spec.txt

diff --git a/docs/interop/index.rst b/docs/interop/index.rst
index 5b9b0653b5..447dcea2e5 100644
--- a/docs/interop/index.rst
+++ b/docs/interop/index.rst
@@ -17,6 +17,7 @@ are useful for making QEMU interoperate with other software.
    nbd
    parallels
    qcow2
+   qed_spec
    prl-xml
    pr-helper
    qmp-spec
diff --git a/docs/interop/qed_spec.rst b/docs/interop/qed_spec.rst
new file mode 100644
index 0000000000..5d9a503c37
--- /dev/null
+++ b/docs/interop/qed_spec.rst
@@ -0,0 +1,219 @@
+===================================
+QED Image File Format Specification
+===================================
+
+The file format looks like this::
+
+    +----------+----------+----------+-----+
+    | cluster0 | cluster1 | cluster2 | ... |
+    +----------+----------+----------+-----+
+
+The first cluster begins with the ``header``. The header contains information
+about where regular clusters start; this allows the header to be extensible and
+store extra information about the image file. A regular cluster may be
+a ``data cluster``, an ``L2``, or an ``L1 table``. L1 and L2 tables are composed
+of one or more contiguous clusters.
+
+Normally the file size will be a multiple of the cluster size. If the file size
+is not a multiple, extra information after the last cluster may not be preserved
+if data is written. Legitimate extra information should use space between the header
+and the first regular cluster.
+
+All fields are little-endian.
+
+Header
+------
+
+::
+
+    Header {
+        uint32_t magic;               /* QED\0 */
+
+        uint32_t cluster_size;        /* in bytes */
+        uint32_t table_size;          /* for L1 and L2 tables, in clusters */
+        uint32_t header_size;         /* in clusters */
+
+        uint64_t features;            /* format feature bits */
+        uint64_t compat_features;     /* compat feature bits */
+        uint64_t autoclear_features;  /* self-resetting feature bits */
+
+        uint64_t l1_table_offset;     /* in bytes */
+        uint64_t image_size;          /* total logical image size, in bytes */
+
+        /* if (features & QED_F_BACKING_FILE) */
+        uint32_t backing_filename_offset; /* in bytes from start of header */
+        uint32_t backing_filename_size;   /* in bytes */
+    }
+
+Field descriptions:
+~~~~~~~~~~~~~~~~~~~
+
+- ``cluster_size`` must be a power of 2 in range [2^12, 2^26].
+- ``table_size`` must be a power of 2 in range [1, 16].
+- ``header_size`` is the number of clusters used by the header and any additional
+  information stored before regular clusters.
+- ``features``, ``compat_features``, and ``autoclear_features`` are file format
+  extension bitmaps. They work as follows:
+
+  - An image with unknown ``features`` bits enabled must not be opened. File format
+    changes that are not backwards-compatible must use ``features`` bits.
+  - An image with unknown ``compat_features`` bits enabled can be opened safely.
+    The unknown features are simply ignored and represent backwards-compatible
+    changes to the file format.
+  - An image with unknown ``autoclear_features`` bits enabled can be opened safely
+    after clearing the unknown bits. This allows for backwards-compatible changes
+    to the file format which degrade gracefully and can be re-enabled again by a
+    new program later.
+- ``l1_table_offset`` is the offset of the first byte of the L1 table in the image
+  file and must be a multiple of ``cluster_size``.
+- ``image_size`` is the block device size seen by the guest and must be a multiple
+  of 512 bytes.
+- ``backing_filename_offset`` and ``backing_filename_size`` describe a string in
+  (byte offset, byte size) form. It is not NUL-terminated and has no alignment constraints.
+  The string must be stored within the first ``header_size`` clusters. The backing filename
+  may be an absolute path or relative to the image file.
+
+Feature bits:
+~~~~~~~~~~~~~
+
+- ``QED_F_BACKING_FILE = 0x01``. The image uses a backing file.
+- ``QED_F_NEED_CHECK = 0x02``. The image needs a consistency check before use.
+- ``QED_F_BACKING_FORMAT_NO_PROBE = 0x04``. The backing file is a raw disk image
+  and no file format autodetection should be attempted. This should be used to
+  ensure that raw backing files are never detected as an image format if they happen
+  to contain magic constants.
+
+There are currently no defined ``compat_features`` or ``autoclear_features`` bits.
+
+Fields predicated on a feature bit are only used when that feature is set.
+The fields always take up header space, regardless of whether or not the feature
+bit is set.
+
+Tables
+------
+
+Tables provide the translation from logical offsets in the block device to cluster
+offsets in the file.
+
+::
+
+    #define TABLE_NOFFSETS (table_size * cluster_size / sizeof(uint64_t))
+
+    Table {
+        uint64_t offsets[TABLE_NOFFSETS];
+    }
+
+The tables are organized as follows::
+
+                        +----------+
+                        | L1 table |
+                        +----------+
+                   ,------'  |  '------.
+              +----------+   |    +----------+
+              | L2 table |  ...   | L2 table |
+              +----------+        +----------+
+         ,------'  |  '------.
+    +----------+   |    +----------+
+    |   Data   |  ...   |   Data   |
+    +----------+        +----------+
+
+A table is made up of one or more contiguous clusters. The ``table_size`` header
+field determines table size for an image file. For example, ``cluster_size=64 KB``
+and ``table_size=4`` results in 256 KB tables.
+
+The logical image size must be less than or equal to the maximum possible size of
+clusters rooted by the L1 table:
+
+.. code::
+
+    header.image_size <= TABLE_NOFFSETS * TABLE_NOFFSETS * header.cluster_size
+
+L1, L2, and data cluster offsets must be aligned to ``header.cluster_size``.
+The following offsets have special meanings:
+
+L2 table offsets
+~~~~~~~~~~~~~~~~
+
+- 0 - unallocated. The L2 table is not yet allocated.
+
+Data cluster offsets
+~~~~~~~~~~~~~~~~~~~~
+
+- 0 - unallocated. The data cluster is not yet allocated.
+- 1 - zero. The data cluster contents are all zeroes and no cluster is allocated.
+
+Future format extensions may wish to store per-offset information. The least
+significant 12 bits of an offset are reserved for this purpose and must be set
+to zero. Image files with ``cluster_size`` > 2^12 will have more unused bits
+which should also be zeroed.
+
+Unallocated L2 tables and data clusters
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Reads to an unallocated area of the image file access the backing file. If there
+is no backing file, then zeroes are produced. The backing file may be smaller
+than the image file and reads of unallocated areas beyond the end of the backing
+file produce zeroes.
+
+Writes to an unallocated area cause a new data cluster to be allocated, and a new
+L2 table if that is also unallocated. The new data cluster is populated with data
+from the backing file (or zeroes if no backing file) and the data being written.
+
+Zero data clusters
+~~~~~~~~~~~~~~~~~~
+
+Zero data clusters are a space-efficient way of storing zeroed regions of the image.
+
+Reads to a zero data cluster produce zeroes.
+
+.. note::
+    The difference between an unallocated and a zero data cluster is that zero data
+    clusters stop the reading of contents from the backing file.
+
+Writes to a zero data cluster cause a new data cluster to be allocated. The new
+data cluster is populated with zeroes and the data being written.
+
+Logical offset translation
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Logical offsets are translated into cluster offsets as follows::
+
+     table_bits table_bits    cluster_bits
+     <--------> <--------> <--------------->
+    +----------+----------+-----------------+
+    | L1 index | L2 index |     byte offset |
+    +----------+----------+-----------------+
+
+          Structure of a logical offset
+
+    offset_mask = ~(cluster_size - 1) # mask for the image file byte offset
+
+    def logical_to_cluster_offset(l1_index, l2_index, byte_offset):
+        l2_offset = l1_table[l1_index]
+        l2_table = load_table(l2_offset)
+        cluster_offset = l2_table[l2_index] & offset_mask
+        return cluster_offset + byte_offset
+
+Consistency checking
+--------------------
+
+This section is informational and included to provide background on the use
+of the ``QED_F_NEED_CHECK`` ``features`` bit.
+
+The ``QED_F_NEED_CHECK`` bit is used to mark an image as dirty before starting
+an operation that could leave the image in an inconsistent state if interrupted
+by a crash or power failure. A dirty image must be checked on open because its
+metadata may not be consistent.
+
+The consistency check verifies the following invariants:
+
+- Each cluster is referenced once and only once. It is an inconsistency to have
+  a cluster referenced more than once by L1 or L2 tables. A cluster has been leaked
+  if it has no references.
+- Offsets must be within the image file size and must be ``cluster_size`` aligned.
+- Table offsets must be at least ``table_size`` * ``cluster_size`` bytes from the end
+  of the image file so that there is space for the entire table.
+
+The consistency check process starts from ``l1_table_offset`` and scans all L2 tables.
+After the check completes with no other errors besides leaks, the ``QED_F_NEED_CHECK``
+bit can be cleared and the image can be accessed.
diff --git a/docs/interop/qed_spec.txt b/docs/interop/qed_spec.txt
deleted file mode 100644
index 7982e058b2..0000000000
--- a/docs/interop/qed_spec.txt
+++ /dev/null
@@ -1,138 +0,0 @@
-=Specification=
-
-The file format looks like this:
-
- +----------+----------+----------+-----+
- | cluster0 | cluster1 | cluster2 | ... |
- +----------+----------+----------+-----+
-
-The first cluster begins with the '''header'''. The header contains information about where regular clusters start; this allows the header to be extensible and store extra information about the image file. A regular cluster may be a '''data cluster''', an '''L2''', or an '''L1 table'''. L1 and L2 tables are composed of one or more contiguous clusters.
-
-Normally the file size will be a multiple of the cluster size. If the file size is not a multiple, extra information after the last cluster may not be preserved if data is written. Legitimate extra information should use space between the header and the first regular cluster.
-
-All fields are little-endian.
-
-==Header==
- Header {
-     uint32_t magic;               /* QED\0 */
- 
-     uint32_t cluster_size;        /* in bytes */
-     uint32_t table_size;          /* for L1 and L2 tables, in clusters */
-     uint32_t header_size;         /* in clusters */
- 
-     uint64_t features;            /* format feature bits */
-     uint64_t compat_features;     /* compat feature bits */
-     uint64_t autoclear_features;  /* self-resetting feature bits */
-
-     uint64_t l1_table_offset;     /* in bytes */
-     uint64_t image_size;          /* total logical image size, in bytes */
- 
-     /* if (features & QED_F_BACKING_FILE) */
-     uint32_t backing_filename_offset; /* in bytes from start of header */
-     uint32_t backing_filename_size;   /* in bytes */
- }
-
-Field descriptions:
-* ''cluster_size'' must be a power of 2 in range [2^12, 2^26].
-* ''table_size'' must be a power of 2 in range [1, 16].
-* ''header_size'' is the number of clusters used by the header and any additional information stored before regular clusters.
-* ''features'', ''compat_features'', and ''autoclear_features'' are file format extension bitmaps. They work as follows:
-** An image with unknown ''features'' bits enabled must not be opened. File format changes that are not backwards-compatible must use ''features'' bits.
-** An image with unknown ''compat_features'' bits enabled can be opened safely. The unknown features are simply ignored and represent backwards-compatible changes to the file format.
-** An image with unknown ''autoclear_features'' bits enable can be opened safely after clearing the unknown bits. This allows for backwards-compatible changes to the file format which degrade gracefully and can be re-enabled again by a new program later.
-* ''l1_table_offset'' is the offset of the first byte of the L1 table in the image file and must be a multiple of ''cluster_size''.
-* ''image_size'' is the block device size seen by the guest and must be a multiple of 512 bytes.
-* ''backing_filename_offset'' and ''backing_filename_size'' describe a string in (byte offset, byte size) form. It is not NUL-terminated and has no alignment constraints. The string must be stored within the first ''header_size'' clusters. The backing filename may be an absolute path or relative to the image file.
-
-Feature bits:
-* QED_F_BACKING_FILE = 0x01. The image uses a backing file.
-* QED_F_NEED_CHECK = 0x02. The image needs a consistency check before use.
-* QED_F_BACKING_FORMAT_NO_PROBE = 0x04. The backing file is a raw disk image and no file format autodetection should be attempted. This should be used to ensure that raw backing files are never detected as an image format if they happen to contain magic constants.
-
-There are currently no defined ''compat_features'' or ''autoclear_features'' bits.
-
-Fields predicated on a feature bit are only used when that feature is set. The fields always take up header space, regardless of whether or not the feature bit is set.
-
-==Tables==
-
-Tables provide the translation from logical offsets in the block device to cluster offsets in the file.
-
- #define TABLE_NOFFSETS (table_size * cluster_size / sizeof(uint64_t))
-  
- Table {
-     uint64_t offsets[TABLE_NOFFSETS];
- }
-
-The tables are organized as follows:
-
-                     +----------+
-                     | L1 table |
-                     +----------+
-                ,------'  |  '------.
-           +----------+   |    +----------+
-           | L2 table |  ...   | L2 table |
-           +----------+        +----------+
-      ,------'  |  '------.
- +----------+   |    +----------+
- |   Data   |  ...   |   Data   |
- +----------+        +----------+
-
-A table is made up of one or more contiguous clusters. The table_size header field determines table size for an image file. For example, cluster_size=64 KB and table_size=4 results in 256 KB tables.
-
-The logical image size must be less than or equal to the maximum possible size of clusters rooted by the L1 table:
- header.image_size <= TABLE_NOFFSETS * TABLE_NOFFSETS * header.cluster_size
-
-L1, L2, and data cluster offsets must be aligned to header.cluster_size. The following offsets have special meanings:
-
-===L2 table offsets===
-* 0 - unallocated. The L2 table is not yet allocated.
-
-===Data cluster offsets===
-* 0 - unallocated. The data cluster is not yet allocated.
-* 1 - zero. The data cluster contents are all zeroes and no cluster is allocated.
-
-Future format extensions may wish to store per-offset information. The least significant 12 bits of an offset are reserved for this purpose and must be set to zero. Image files with cluster_size > 2^12 will have more unused bits which should also be zeroed.
-
-===Unallocated L2 tables and data clusters===
-Reads to an unallocated area of the image file access the backing file. If there is no backing file, then zeroes are produced. The backing file may be smaller than the image file and reads of unallocated areas beyond the end of the backing file produce zeroes.
-
-Writes to an unallocated area cause a new data clusters to be allocated, and a new L2 table if that is also unallocated. The new data cluster is populated with data from the backing file (or zeroes if no backing file) and the data being written.
-
-===Zero data clusters===
-Zero data clusters are a space-efficient way of storing zeroed regions of the image.
-
-Reads to a zero data cluster produce zeroes. Note that the difference between an unallocated and a zero data cluster is that zero data clusters stop the reading of contents from the backing file.
-
-Writes to a zero data cluster cause a new data cluster to be allocated. The new data cluster is populated with zeroes and the data being written.
-
-===Logical offset translation===
-Logical offsets are translated into cluster offsets as follows:
-
-  table_bits table_bits    cluster_bits
-  <--------> <--------> <--------------->
- +----------+----------+-----------------+
- | L1 index | L2 index |     byte offset |
- +----------+----------+-----------------+
- 
-       Structure of a logical offset
-
- offset_mask = ~(cluster_size - 1) # mask for the image file byte offset
- 
- def logical_to_cluster_offset(l1_index, l2_index, byte_offset):
-   l2_offset = l1_table[l1_index]
-   l2_table = load_table(l2_offset)
-   cluster_offset = l2_table[l2_index] & offset_mask
-   return cluster_offset + byte_offset
-
-==Consistency checking==
-
-This section is informational and included to provide background on the use of the QED_F_NEED_CHECK ''features'' bit.
-
-The QED_F_NEED_CHECK bit is used to mark an image as dirty before starting an operation that could leave the image in an inconsistent state if interrupted by a crash or power failure. A dirty image must be checked on open because its metadata may not be consistent.
-
-Consistency check includes the following invariants:
-# Each cluster is referenced once and only once. It is an inconsistency to have a cluster referenced more than once by L1 or L2 tables. A cluster has been leaked if it has no references.
-# Offsets must be within the image file size and must be ''cluster_size'' aligned.
-# Table offsets must at least ''table_size'' * ''cluster_size'' bytes from the end of the image file so that there is space for the entire table.
-
-The consistency check process starts by from ''l1_table_offset'' and scans all L2 tables. After the check completes with no other errors besides leaks, the QED_F_NEED_CHECK bit can be cleared and the image can be accessed.
-- 
2.49.0
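
Note on the logical offset translation in the spec: the pseudocode takes
l1_index, l2_index and byte_offset as inputs without showing how they are
derived from a logical offset. The Python sketch below is one possible way
to do that split, read off the "Structure of a logical offset" diagram and
the TABLE_NOFFSETS definition. The function names split_logical_offset and
max_image_size are illustrative only (they are not part of the spec or of
QEMU), and the sketch assumes cluster_size and table_size already satisfy
the power-of-2 constraints given in the field descriptions.

```python
def table_noffsets(cluster_size, table_size):
    """Number of 8-byte offsets in one table (TABLE_NOFFSETS in the spec)."""
    return table_size * cluster_size // 8


def split_logical_offset(pos, cluster_size, table_size):
    """Split a logical byte offset into (l1_index, l2_index, byte_offset).

    Assumes cluster_size and table_size are powers of 2, so the number of
    table entries is a power of 2 as well and plain shifts and masks work.
    """
    noffsets = table_noffsets(cluster_size, table_size)
    cluster_bits = cluster_size.bit_length() - 1  # log2(cluster_size)
    table_bits = noffsets.bit_length() - 1        # log2(TABLE_NOFFSETS)

    byte_offset = pos & (cluster_size - 1)             # low cluster_bits bits
    l2_index = (pos >> cluster_bits) & (noffsets - 1)  # next table_bits bits
    l1_index = pos >> (cluster_bits + table_bits)      # remaining high bits
    return l1_index, l2_index, byte_offset


def max_image_size(cluster_size, table_size):
    """Upper bound on header.image_size for the given geometry."""
    noffsets = table_noffsets(cluster_size, table_size)
    return noffsets * noffsets * cluster_size
```

With cluster_size = 64 KiB and table_size = 4, each table holds 32768
offsets, so cluster_bits is 16, table_bits is 15, and the largest logical
image size is 2^46 bytes (64 TiB).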