From: Alberto Garcia
To: qemu-devel@nongnu.org
Subject: [RFC PATCH 03/23] qcow2: Process QCOW2_CLUSTER_ZERO_ALLOC clusters in handle_copied()
Date: Tue, 15 Oct 2019 18:23:14 +0300
Message-Id: <21379e61f052c9f8c2f4d13eb0c079195f4531d2.1571152571.git.berto@igalia.com>
Cc: Kevin Wolf, Anton Nefedov, Alberto Garcia, qemu-block@nongnu.org, Max Reitz, "Denis V . Lunev"

When writing to a qcow2 file there are two functions that take a
virtual offset and return a host offset, possibly allocating new
clusters if necessary:

   - handle_copied() looks for normal data clusters that are already
     allocated and have a reference count of 1. In those clusters we
     can simply write the data and there is no need to perform any
     copy-on-write.

   - handle_alloc() looks for clusters that do need copy-on-write,
     either because they haven't been allocated yet, because their
     reference count is != 1 or because they are ZERO_ALLOC clusters.

The ZERO_ALLOC case is a bit special because those are clusters that
are already allocated and they could perfectly well be dealt with in
handle_copied() (as long as copy-on-write is performed when required).

In fact, there is extra code specifically for them in handle_alloc()
that tries to reuse the existing allocation if possible and frees them
otherwise.

This patch changes the handling of ZERO_ALLOC clusters so the
semantics of these two functions are now like this:

   - handle_copied() looks for clusters that are already allocated and
     which we can overwrite (NORMAL and ZERO_ALLOC clusters with a
     reference count of 1).

   - handle_alloc() looks for clusters for which we need a new
     allocation (all other cases).

One important difference after this change is that clusters found in
handle_copied() may now require copy-on-write, but this will be
necessary anyway once we add support for subclusters.

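As a rough illustration of the new split, the per-cluster decision can
be modelled as below. This is a simplified sketch and not the code in
block/qcow2-cluster.c: the ModelClusterType, ModelL2Entry and model_*
names are made up for this example, and the `copied` field only stands
in for QCOW_OFLAG_COPIED (that is, a reference count of 1).

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the qcow2 cluster types (not QEMU's enum). */
typedef enum {
    CLUSTER_UNALLOCATED,
    CLUSTER_ZERO_PLAIN,
    CLUSTER_ZERO_ALLOC,
    CLUSTER_NORMAL,
    CLUSTER_COMPRESSED,
} ModelClusterType;

/* Hypothetical model of an L2 entry: cluster type plus a "refcount is 1" flag. */
typedef struct {
    ModelClusterType type;
    bool copied; /* models QCOW_OFLAG_COPIED */
} ModelL2Entry;

/*
 * Returns true if a write needs a new allocation, i.e. the case that
 * handle_alloc() is described as covering after this patch.
 */
static bool model_needs_new_allocation(ModelL2Entry e)
{
    switch (e.type) {
    case CLUSTER_NORMAL:
    case CLUSTER_ZERO_ALLOC:
        /* Already allocated: reusable only if nobody else references it */
        return !e.copied;
    case CLUSTER_UNALLOCATED:
    case CLUSTER_ZERO_PLAIN:
    case CLUSTER_COMPRESSED:
        return true;
    }
    return true; /* treat unknown values conservatively in this model */
}

/*
 * Returns true if the cluster can be overwritten in place but the parts
 * not covered by the write still need copy-on-write. In this model that
 * is the ZERO_ALLOC case; a NORMAL cluster with refcount 1 needs none.
 */
static bool model_overwrite_needs_cow(ModelL2Entry e)
{
    return !model_needs_new_allocation(e) && e.type == CLUSTER_ZERO_ALLOC;
}
```

In this model a ZERO_ALLOC cluster with `copied` set is kept in place
(the handle_copied() side) but still reports that copy-on-write is
needed, which is the behavioural difference mentioned above.
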
Signed-off-by: Alberto Garcia
---
 block/qcow2-cluster.c | 177 +++++++++++++++++++++++-------------------
 1 file changed, 96 insertions(+), 81 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f462e169c0..70b2e32f7e 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1021,7 +1021,8 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
 
 /*
  * For a given write request, create a new QCowL2Meta structure and
- * add it to @m.
+ * add it to @m. If the write request does not need copy-on-write or
+ * changes to the L2 metadata then this function does nothing.
  *
  * @host_offset points to the beginning of the first cluster.
  *
@@ -1034,15 +1035,51 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
  */
 static void calculate_l2_meta(BlockDriverState *bs, uint64_t host_offset,
                               uint64_t guest_offset, uint64_t bytes,
-                              QCowL2Meta **m, bool keep_old)
+                              uint64_t *l2_slice, QCowL2Meta **m, bool keep_old)
 {
     BDRVQcow2State *s = bs->opaque;
-    unsigned cow_start_from = 0;
+    int l2_index = offset_to_l2_slice_index(s, guest_offset);
+    uint64_t l2_entry;
+    unsigned cow_start_from, cow_end_to;
     unsigned cow_start_to = offset_into_cluster(s, guest_offset);
     unsigned cow_end_from = cow_start_to + bytes;
-    unsigned cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
     unsigned nb_clusters = size_to_clusters(s, cow_end_from);
     QCowL2Meta *old_m = *m;
+    QCow2ClusterType type;
+
+    /* Return if there's no COW (all clusters are normal and we keep them) */
+    if (keep_old) {
+        int i;
+        for (i = 0; i < nb_clusters; i++) {
+            l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+            if (qcow2_get_cluster_type(bs, l2_entry) != QCOW2_CLUSTER_NORMAL) {
+                break;
+            }
+        }
+        if (i == nb_clusters) {
+            return;
+        }
+    }
+
+    /* Get the L2 entry from the first cluster */
+    l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    type = qcow2_get_cluster_type(bs, l2_entry);
+
+    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
+        cow_start_from = cow_start_to;
+    } else {
+        cow_start_from = 0;
+    }
+
+    /* Get the L2 entry from the last cluster */
+    l2_entry = be64_to_cpu(l2_slice[l2_index + nb_clusters - 1]);
+    type = qcow2_get_cluster_type(bs, l2_entry);
+
+    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
+        cow_end_to = cow_end_from;
+    } else {
+        cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+    }
 
     *m = g_malloc0(sizeof(**m));
     **m = (QCowL2Meta) {
@@ -1068,18 +1105,18 @@ static void calculate_l2_meta(BlockDriverState *bs, uint64_t host_offset,
     QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
 }
 
-/* Returns true if writing to a cluster requires COW */
+/* Returns true if the cluster is unallocated or has refcount > 1 */
 static bool cluster_needs_cow(BlockDriverState *bs, uint64_t l2_entry)
 {
     switch (qcow2_get_cluster_type(bs, l2_entry)) {
     case QCOW2_CLUSTER_NORMAL:
+    case QCOW2_CLUSTER_ZERO_ALLOC:
         if (l2_entry & QCOW_OFLAG_COPIED) {
             return false;
         }
     case QCOW2_CLUSTER_UNALLOCATED:
     case QCOW2_CLUSTER_COMPRESSED:
     case QCOW2_CLUSTER_ZERO_PLAIN:
-    case QCOW2_CLUSTER_ZERO_ALLOC:
         return true;
     default:
         abort();
@@ -1087,20 +1124,34 @@ static bool cluster_needs_cow(BlockDriverState *bs, uint64_t l2_entry)
 }
 
 /*
- * Returns the number of contiguous clusters that can be used for an allocating
- * write, but require COW to be performed (this includes yet unallocated space,
- * which must copy from the backing file)
+ * Returns the number of contiguous clusters that can be written to
+ * using one single write request, starting from @l2_index.
+ * At most @nb_clusters are checked.
+ *
+ * If @want_cow is true this counts clusters that are either
+ * unallocated, or allocated but with refcount > 1.
+ *
+ * If @want_cow is false this counts clusters that are already
+ * allocated and can be written to using their current locations
+ * (including QCOW2_CLUSTER_ZERO_ALLOC).
  */
 static int count_cow_clusters(BlockDriverState *bs, int nb_clusters,
-                              uint64_t *l2_slice, int l2_index)
+                              uint64_t *l2_slice, int l2_index, bool want_cow)
 {
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index]);
+    uint64_t expected_offset = l2_entry & L2E_OFFSET_MASK;
     int i;
 
     for (i = 0; i < nb_clusters; i++) {
-        uint64_t l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
-        if (!cluster_needs_cow(bs, l2_entry)) {
+        l2_entry = be64_to_cpu(l2_slice[l2_index + i]);
+        if (cluster_needs_cow(bs, l2_entry) != want_cow) {
             break;
         }
+        if (!want_cow && expected_offset != (l2_entry & L2E_OFFSET_MASK)) {
+            break;
+        }
+        expected_offset += s->cluster_size;
     }
 
     assert(i <= nb_clusters);
@@ -1228,18 +1279,17 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
 
     cluster_offset = be64_to_cpu(l2_slice[l2_index]);
 
-    /* Check how many clusters are already allocated and don't need COW */
-    if (qcow2_get_cluster_type(bs, cluster_offset) == QCOW2_CLUSTER_NORMAL
-        && (cluster_offset & QCOW_OFLAG_COPIED))
-    {
+    if (!cluster_needs_cow(bs, cluster_offset)) {
         /* If a specific host_offset is required, check it */
         bool offset_matches =
             (cluster_offset & L2E_OFFSET_MASK) == *host_offset;
 
         if (offset_into_cluster(s, cluster_offset & L2E_OFFSET_MASK)) {
-            qcow2_signal_corruption(bs, true, -1, -1, "Data cluster offset "
+            qcow2_signal_corruption(bs, true, -1, -1, "%s cluster offset "
                                     "%#llx unaligned (guest offset: %#" PRIx64
-                                    ")", cluster_offset & L2E_OFFSET_MASK,
+                                    ")", cluster_offset & QCOW_OFLAG_ZERO ?
+                                    "Preallocated zero" : "Data",
+                                    cluster_offset & L2E_OFFSET_MASK,
                                     guest_offset);
             ret = -EIO;
             goto out;
@@ -1252,15 +1302,17 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
         }
 
         /* We keep all QCOW_OFLAG_COPIED clusters */
-        keep_clusters =
-            count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
-                                      &l2_slice[l2_index],
-                                      QCOW_OFLAG_COPIED | QCOW_OFLAG_ZERO);
+        keep_clusters = count_cow_clusters(bs, nb_clusters, l2_slice,
+                                           l2_index, false);
         assert(keep_clusters <= nb_clusters);
 
         *bytes = MIN(*bytes,
                  keep_clusters * s->cluster_size
                  - offset_into_cluster(s, guest_offset));
+        assert(*bytes != 0);
+
+        calculate_l2_meta(bs, cluster_offset & L2E_OFFSET_MASK, guest_offset,
+                          *bytes, l2_slice, m, true);
 
         ret = 1;
     } else {
@@ -1361,12 +1413,10 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     BDRVQcow2State *s = bs->opaque;
     int l2_index;
     uint64_t *l2_slice;
-    uint64_t entry;
     uint64_t nb_clusters;
     int ret;
-    bool keep_old_clusters = false;
 
-    uint64_t alloc_cluster_offset = INV_OFFSET;
+    uint64_t alloc_cluster_offset;
 
     trace_qcow2_handle_alloc(qemu_coroutine_self(), guest_offset, *host_offset,
                              *bytes);
@@ -1389,67 +1439,31 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
         return ret;
     }
 
-    entry = be64_to_cpu(l2_slice[l2_index]);
-    nb_clusters = count_cow_clusters(bs, nb_clusters, l2_slice, l2_index);
+    nb_clusters = count_cow_clusters(bs, nb_clusters, l2_slice, l2_index, true);
 
     /* This function is only called when there were no non-COW clusters, so if
      * we can't find any unallocated or COW clusters either, something is
      * wrong with our code. */
     assert(nb_clusters > 0);
 
-    if (qcow2_get_cluster_type(bs, entry) == QCOW2_CLUSTER_ZERO_ALLOC &&
-        (entry & QCOW_OFLAG_COPIED) &&
-        (*host_offset == INV_OFFSET ||
-         start_of_cluster(s, *host_offset) == (entry & L2E_OFFSET_MASK)))
-    {
-        int preallocated_nb_clusters;
-
-        if (offset_into_cluster(s, entry & L2E_OFFSET_MASK)) {
-            qcow2_signal_corruption(bs, true, -1, -1, "Preallocated zero "
-                                    "cluster offset %#llx unaligned (guest "
-                                    "offset: %#" PRIx64 ")",
-                                    entry & L2E_OFFSET_MASK, guest_offset);
-            ret = -EIO;
-            goto fail;
-        }
-
-        /* Try to reuse preallocated zero clusters; contiguous normal clusters
-         * would be fine, too, but count_cow_clusters() above has limited
-         * nb_clusters already to a range of COW clusters */
-        preallocated_nb_clusters =
-            count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
-                                      &l2_slice[l2_index], QCOW_OFLAG_COPIED);
-        assert(preallocated_nb_clusters > 0);
-
-        nb_clusters = preallocated_nb_clusters;
-        alloc_cluster_offset = entry & L2E_OFFSET_MASK;
-
-        /* We want to reuse these clusters, so qcow2_alloc_cluster_link_l2()
-         * should not free them. */
-        keep_old_clusters = true;
+    /* Allocate, if necessary at a given offset in the image file */
+    alloc_cluster_offset = *host_offset == INV_OFFSET ? INV_OFFSET :
+                           start_of_cluster(s, *host_offset);
+    ret = do_alloc_cluster_offset(bs, guest_offset, &alloc_cluster_offset,
+                                  &nb_clusters);
+    if (ret < 0) {
+        goto out;
     }
 
-    qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
-
-    if (alloc_cluster_offset == INV_OFFSET) {
-        /* Allocate, if necessary at a given offset in the image file */
-        alloc_cluster_offset = *host_offset == INV_OFFSET ? INV_OFFSET :
-                               start_of_cluster(s, *host_offset);
-        ret = do_alloc_cluster_offset(bs, guest_offset, &alloc_cluster_offset,
-                                      &nb_clusters);
-        if (ret < 0) {
-            goto fail;
-        }
-
-        /* Can't extend contiguous allocation */
-        if (nb_clusters == 0) {
-            *bytes = 0;
-            return 0;
-        }
-
-        assert(alloc_cluster_offset != INV_OFFSET);
+    /* Can't extend contiguous allocation */
+    if (nb_clusters == 0) {
+        *bytes = 0;
+        ret = 0;
+        goto out;
     }
 
+    assert(alloc_cluster_offset != INV_OFFSET);
+
     /*
      * Save info needed for meta data update.
      *
@@ -1472,13 +1486,14 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
     assert(*bytes != 0);
 
-    calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes,
-                      m, keep_old_clusters);
+    calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes, l2_slice,
+                      m, false);
 
-    return 1;
+    ret = 1;
 
-fail:
-    if (*m && (*m)->nb_clusters > 0) {
+out:
+    qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
+    if (ret < 0 && *m && (*m)->nb_clusters > 0) {
         QLIST_REMOVE(*m, next_in_flight);
     }
     return ret;
-- 
2.20.1