From: Max Reitz
To: qemu-block@nongnu.org
Cc: Kevin Wolf, Peter Maydell, qemu-devel@nongnu.org, Max Reitz
Subject: [PULL 20/34] qcow2: Add subcluster support to calculate_l2_meta()
Date: Tue, 25 Aug 2020 10:32:57 +0200
Message-Id: <20200825083311.1098442-21-mreitz@redhat.com>
In-Reply-To: <20200825083311.1098442-1-mreitz@redhat.com>
References: <20200825083311.1098442-1-mreitz@redhat.com>

From: Alberto Garcia

If an image has subclusters then there are more copy-on-write
scenarios that we need to consider. Let's say we have a write request
from the middle of subcluster #3 until the end of the cluster:

1) If we are writing to a newly allocated cluster then we need
   copy-on-write. The previous contents of subclusters #0 to #3 must
   be copied to the new cluster. We can optimize this process by
   skipping all leading unallocated or zero subclusters (the status of
   those skipped subclusters will be reflected in the new L2 bitmap).

2) If we are overwriting an existing cluster:

   2.1) If subcluster #3 is unallocated or has the all-zeroes bit set
        then we need copy-on-write (on subcluster #3 only).

   2.2) If subcluster #3 was already allocated then there is no need
        for any copy-on-write. However we still need to update the L2
        bitmap to reflect possible changes in the allocation status of
        subclusters #4 to #31. Because of this, this function checks
        if all the overwritten subclusters are already allocated and
        in this case it returns without creating a new QCowL2Meta
        structure.

After all these changes l2meta_cow_start() and l2meta_cow_end()
are not necessarily cluster-aligned anymore. We need to update the
calculation of old_start and old_end in handle_dependencies() to
guarantee that no two requests try to write on the same cluster.

Signed-off-by: Alberto Garcia
Reviewed-by: Eric Blake
Reviewed-by: Max Reitz
Message-Id: <4292dd56e4446d386a2fe307311737a711c00708.1594396418.git.berto@igalia.com>
Signed-off-by: Max Reitz
---
 block/qcow2-cluster.c | 167 +++++++++++++++++++++++++++++++++---------
 1 file changed, 133 insertions(+), 34 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 5937937596..5e7ae0843d 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -387,7 +387,6 @@ fail:
  * If the L2 entry is invalid return -errno and set @type to
  * QCOW2_SUBCLUSTER_INVALID.
  */
-G_GNUC_UNUSED
 static int qcow2_get_subcluster_range_type(BlockDriverState *bs,
                                            uint64_t l2_entry,
                                            uint64_t l2_bitmap,
@@ -1111,56 +1110,148 @@ void qcow2_alloc_cluster_abort(BlockDriverState *bs, QCowL2Meta *m)
  * If @keep_old is true it means that the clusters were already
  * allocated and will be overwritten. If false then the clusters are
  * new and we have to decrease the reference count of the old ones.
+ *
+ * Returns 0 on success, -errno on failure.
  */
-static void calculate_l2_meta(BlockDriverState *bs,
-                              uint64_t host_cluster_offset,
-                              uint64_t guest_offset, unsigned bytes,
-                              uint64_t *l2_slice, QCowL2Meta **m, bool keep_old)
+static int calculate_l2_meta(BlockDriverState *bs, uint64_t host_cluster_offset,
+                             uint64_t guest_offset, unsigned bytes,
+                             uint64_t *l2_slice, QCowL2Meta **m, bool keep_old)
 {
     BDRVQcow2State *s = bs->opaque;
-    int l2_index = offset_to_l2_slice_index(s, guest_offset);
-    uint64_t l2_entry;
+    int sc_index, l2_index = offset_to_l2_slice_index(s, guest_offset);
+    uint64_t l2_entry, l2_bitmap;
     unsigned cow_start_from, cow_end_to;
     unsigned cow_start_to = offset_into_cluster(s, guest_offset);
     unsigned cow_end_from = cow_start_to + bytes;
     unsigned nb_clusters = size_to_clusters(s, cow_end_from);
     QCowL2Meta *old_m = *m;
-    QCow2ClusterType type;
+    QCow2SubclusterType type;
+    int i;
+    bool skip_cow = keep_old;
 
     assert(nb_clusters <= s->l2_slice_size - l2_index);
 
-    /* Return if there's no COW (all clusters are normal and we keep them) */
-    if (keep_old) {
-        int i;
-        for (i = 0; i < nb_clusters; i++) {
-            l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
-            if (qcow2_get_cluster_type(bs, l2_entry) != QCOW2_CLUSTER_NORMAL) {
-                break;
+    /* Check the type of all affected subclusters */
+    for (i = 0; i < nb_clusters; i++) {
+        l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
+        l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
+        if (skip_cow) {
+            unsigned write_from = MAX(cow_start_to, i << s->cluster_bits);
+            unsigned write_to = MIN(cow_end_from, (i + 1) << s->cluster_bits);
+            int first_sc = offset_to_sc_index(s, write_from);
+            int last_sc = offset_to_sc_index(s, write_to - 1);
+            int cnt = qcow2_get_subcluster_range_type(bs, l2_entry, l2_bitmap,
+                                                      first_sc, &type);
+            /* Is any of the subclusters of type != QCOW2_SUBCLUSTER_NORMAL ? */
+            if (type != QCOW2_SUBCLUSTER_NORMAL || first_sc + cnt <= last_sc) {
+                skip_cow = false;
             }
+        } else {
+            /* If we can't skip the cow we can still look for invalid entries */
+            type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, 0);
         }
-        if (i == nb_clusters) {
-            return;
+        if (type == QCOW2_SUBCLUSTER_INVALID) {
+            int l1_index = offset_to_l1_index(s, guest_offset);
+            uint64_t l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
+            qcow2_signal_corruption(bs, true, -1, -1, "Invalid cluster "
+                                    "entry found (L2 offset: %#" PRIx64
+                                    ", L2 index: %#x)",
+                                    l2_offset, l2_index + i);
+            return -EIO;
         }
     }
 
+    if (skip_cow) {
+        return 0;
+    }
+
     /* Get the L2 entry of the first cluster */
     l2_entry = get_l2_entry(s, l2_slice, l2_index);
-    type = qcow2_get_cluster_type(bs, l2_entry);
-
-    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
-        cow_start_from = cow_start_to;
+    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
+    sc_index = offset_to_sc_index(s, guest_offset);
+    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);
+
+    if (!keep_old) {
+        switch (type) {
+        case QCOW2_SUBCLUSTER_COMPRESSED:
+            cow_start_from = 0;
+            break;
+        case QCOW2_SUBCLUSTER_NORMAL:
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            if (has_subclusters(s)) {
+                /* Skip all leading zero and unallocated subclusters */
+                uint32_t alloc_bitmap = l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC;
+                cow_start_from =
+                    MIN(sc_index, ctz32(alloc_bitmap)) << s->subcluster_bits;
+            } else {
+                cow_start_from = 0;
+            }
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+            cow_start_from = sc_index << s->subcluster_bits;
+            break;
+        default:
+            g_assert_not_reached();
+        }
     } else {
-        cow_start_from = 0;
+        switch (type) {
+        case QCOW2_SUBCLUSTER_NORMAL:
+            cow_start_from = cow_start_to;
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            cow_start_from = sc_index << s->subcluster_bits;
+            break;
+        default:
+            g_assert_not_reached();
+        }
     }
 
     /* Get the L2 entry of the last cluster */
-    l2_entry = get_l2_entry(s, l2_slice, l2_index + nb_clusters - 1);
-    type = qcow2_get_cluster_type(bs, l2_entry);
-
-    if (type == QCOW2_CLUSTER_NORMAL && keep_old) {
-        cow_end_to = cow_end_from;
+    l2_index += nb_clusters - 1;
+    l2_entry = get_l2_entry(s, l2_slice, l2_index);
+    l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
+    sc_index = offset_to_sc_index(s, guest_offset + bytes - 1);
+    type = qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, sc_index);
+
+    if (!keep_old) {
+        switch (type) {
+        case QCOW2_SUBCLUSTER_COMPRESSED:
+            cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+            break;
+        case QCOW2_SUBCLUSTER_NORMAL:
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+            if (has_subclusters(s)) {
+                /* Skip all trailing zero and unallocated subclusters */
+                uint32_t alloc_bitmap = l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC;
+                cow_end_to -=
+                    MIN(s->subclusters_per_cluster - sc_index - 1,
+                        clz32(alloc_bitmap)) << s->subcluster_bits;
+            }
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
+            cow_end_to = ROUND_UP(cow_end_from, s->subcluster_size);
+            break;
+        default:
+            g_assert_not_reached();
+        }
     } else {
-        cow_end_to = ROUND_UP(cow_end_from, s->cluster_size);
+        switch (type) {
+        case QCOW2_SUBCLUSTER_NORMAL:
+            cow_end_to = cow_end_from;
+            break;
+        case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+        case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC:
+            cow_end_to = ROUND_UP(cow_end_from, s->subcluster_size);
+            break;
+        default:
+            g_assert_not_reached();
+        }
     }
 
     *m = g_malloc0(sizeof(**m));
@@ -1185,6 +1276,8 @@ static void calculate_l2_meta(BlockDriverState *bs,
 
     qemu_co_queue_init(&(*m)->dependent_requests);
     QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
+
+    return 0;
 }
 
 /*
@@ -1273,8 +1366,8 @@ static int handle_dependencies(BlockDriverState *bs, uint64_t guest_offset,
 
         uint64_t start = guest_offset;
         uint64_t end = start + bytes;
-        uint64_t old_start = l2meta_cow_start(old_alloc);
-        uint64_t old_end = l2meta_cow_end(old_alloc);
+        uint64_t old_start = start_of_cluster(s, l2meta_cow_start(old_alloc));
+        uint64_t old_end = ROUND_UP(l2meta_cow_end(old_alloc), s->cluster_size);
 
         if (end <= old_start || start >= old_end) {
             /* No intersection */
@@ -1399,8 +1492,11 @@ static int handle_copied(BlockDriverState *bs, uint64_t guest_offset,
                      - offset_into_cluster(s, guest_offset));
         assert(*bytes != 0);
 
-        calculate_l2_meta(bs, cluster_offset, guest_offset,
-                          *bytes, l2_slice, m, true);
+        ret = calculate_l2_meta(bs, cluster_offset, guest_offset,
+                                *bytes, l2_slice, m, true);
+        if (ret < 0) {
+            goto out;
+        }
 
         ret = 1;
     } else {
@@ -1576,8 +1672,11 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
     *bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
     assert(*bytes != 0);
 
-    calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes, l2_slice,
-                      m, false);
+    ret = calculate_l2_meta(bs, alloc_cluster_offset, guest_offset, *bytes,
+                            l2_slice, m, false);
+    if (ret < 0) {
+        goto out;
+    }
 
     ret = 1;
 
-- 
2.26.2
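
Illustration (not part of the patch): the skipping of leading and
trailing unallocated subclusters described in case 1) of the commit
message can be sketched in isolation as follows. This is a minimal
standalone sketch, not the QEMU code. It assumes a single 64 KiB
cluster with 32 subclusters of 2 KiB each, a 32-bit allocation bitmap
with bit N set when subcluster N is allocated, and a write that stays
within that one cluster. All names prefixed with sketch_, the min_u()
helper and the geometry constants are hypothetical and exist only for
this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry, for illustration only. */
enum {
    CLUSTER_BITS = 16,              /* 64 KiB clusters */
    SUBCLUSTER_BITS = 11,           /* 2 KiB subclusters */
    SUBCLUSTERS_PER_CLUSTER = 32,
};
#define CLUSTER_SIZE (1u << CLUSTER_BITS)

static unsigned min_u(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/* Count trailing zero bits; returns 32 when v is 0. */
static unsigned sketch_ctz32(uint32_t v)
{
    unsigned n = 0;
    if (v == 0) {
        return 32;
    }
    while (!(v & 1u)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Count leading zero bits; returns 32 when v is 0. */
static unsigned sketch_clz32(uint32_t v)
{
    unsigned n = 0;
    if (v == 0) {
        return 32;
    }
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/*
 * Offset (within the cluster) where the head copy-on-write region
 * starts when writing to a newly allocated cluster: skip every leading
 * subcluster that is not allocated, but never go past the subcluster
 * that contains the first written byte.
 */
static unsigned sketch_cow_start_from(uint32_t alloc_bitmap,
                                      unsigned write_start)
{
    unsigned sc_index = write_start >> SUBCLUSTER_BITS;
    assert(write_start < CLUSTER_SIZE);
    return min_u(sc_index, sketch_ctz32(alloc_bitmap)) << SUBCLUSTER_BITS;
}

/*
 * Offset (within the cluster) where the tail copy-on-write region
 * ends: start from the end of the cluster and drop every trailing
 * subcluster that is not allocated, but never go below the end of the
 * subcluster that contains the last written byte.
 */
static unsigned sketch_cow_end_to(uint32_t alloc_bitmap, unsigned write_end)
{
    unsigned sc_index;
    assert(write_end > 0 && write_end <= CLUSTER_SIZE);
    sc_index = (write_end - 1) >> SUBCLUSTER_BITS;
    return CLUSTER_SIZE -
           (min_u(SUBCLUSTERS_PER_CLUSTER - sc_index - 1,
                  sketch_clz32(alloc_bitmap)) << SUBCLUSTER_BITS);
}
```

With subclusters 2, 3 and 10 allocated (bitmap 0x40C) and a write
starting at offset 7168 (the middle of subcluster #3),
sketch_cow_start_from() returns 4096, so subclusters #0 and #1 are
skipped. For a write ending at offset 7680, sketch_cow_end_to() returns
22528, which is the end of subcluster #10, the last allocated one.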