From nobody Sat May 4 21:26:54 2024
From: Vladimir Sementsov-Ogievskiy
To: qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, fam@euphon.net, vsementsov@virtuozzo.com, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org
Date: Mon, 8 Apr 2019 19:26:16 +0300
Message-Id: <20190408162617.258535-2-vsementsov@virtuozzo.com>
In-Reply-To: <20190408162617.258535-1-vsementsov@virtuozzo.com>
References: <20190408162617.258535-1-vsementsov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v2 1/2] tests/perf: Test lseek influence on qcow2 block-status

The block layer may recursively check block_status in the file child of
qcow2 if the qcow2 driver returned DATA. These test cases measure the
influence of lseek on block_status performance. To see a real
difference, run them on tmpfs.

The tests were originally created by Kevin; I only refactored them and
put them together into one executable file with simple output.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 tests/perf/block/qcow2/convert-blockstatus | 71 ++++++++++++++++++++++
 1 file changed, 71 insertions(+)
 create mode 100755 tests/perf/block/qcow2/convert-blockstatus

diff --git a/tests/perf/block/qcow2/convert-blockstatus b/tests/perf/block/qcow2/convert-blockstatus
new file mode 100755
index 0000000000..a1a3c1ef43
--- /dev/null
+++ b/tests/perf/block/qcow2/convert-blockstatus
@@ -0,0 +1,71 @@
+#!/bin/bash
+#
+# Test lseek influence on qcow2 block-status
+#
+# Block layer may recursively check block_status in file child of qcow2, if
+# qcow2 driver returned DATA. There are several test cases to check influence
+# of lseek on block_status performance. To see real difference run on tmpfs.
+#
+# Copyright (c) 2019 Virtuozzo International GmbH. All rights reserved.
+#
+# Tests originally written by Kevin Wolf
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see .
+#
+
+if [ "$#" -lt 1 ]; then
+    echo "Usage: $0 SOURCE_FILE"
+    exit 1
+fi
+
+ROOT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )/../../../.." >/dev/null 2>&1 && pwd )"
+QEMU_IMG="$ROOT_DIR/qemu-img"
+QEMU_IO="$ROOT_DIR/qemu-io"
+
+size=1G
+src="$1"
+
+# test-case plain
+
+(
+$QEMU_IMG create -f qcow2 "$src" $size
+for i in $(seq 16384 -1 0); do
+    echo "write $((i * 65536)) 64k"
+done | $QEMU_IO "$src"
+) > /dev/null
+
+echo -n "plain: "
+/usr/bin/time -f %e $QEMU_IMG convert -n "$src" null-co://
+
+# test-case forward
+
+(
+$QEMU_IMG create -f qcow2 "$src" $size
+for i in $(seq 0 2 16384); do
+    echo "write $((i * 65536)) 64k"
+done | $QEMU_IO "$src"
+for i in $(seq 1 2 16384); do
+    echo "write $((i * 65536)) 64k"
+done | $QEMU_IO "$src"
+) > /dev/null
+
+echo -n "forward: "
+/usr/bin/time -f %e $QEMU_IMG convert -n "$src" null-co://
+
+# test-case prealloc
+
+$QEMU_IMG create -f qcow2 -o preallocation=metadata "$src" $size > /dev/null
+
+echo -n "prealloc: "
+/usr/bin/time -f %e $QEMU_IMG convert -n "$src" null-co://
--
2.18.0

From nobody Sat May 4 21:26:54 2024
From: Vladimir Sementsov-Ogievskiy
To: qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, fam@euphon.net, vsementsov@virtuozzo.com, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org
Date: Mon, 8 Apr 2019 19:26:17 +0300
Message-Id: <20190408162617.258535-3-vsementsov@virtuozzo.com>
In-Reply-To: <20190408162617.258535-1-vsementsov@virtuozzo.com>
References: <20190408162617.258535-1-vsementsov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v2 2/2] block: avoid recursive block_status call if possible

Since 5daa74a6ebc, bdrv_co_block_status digs into bs->file for an
additional, more accurate search for holes inside regions that bs
reports as DATA.

This accuracy is not free. Assume we have a qcow2 disk. qcow2 already
knows where the holes are and where the data is, but every block_status
request additionally calls lseek. For a big disk full of data, any
iterative copying block job (or img convert) calls lseek(HOLE) on every
iteration, and each of these lseeks has to iterate through all metadata
up to the end of the file. This is obviously inefficient, and for many
scenarios we don't need this lseek at all.

However, lseek is needed when we have a metadata-preallocated image.

So, let's detect the metadata-preallocation case and not dig into
qcow2's protocol file in other cases.

The idea is to compare the allocation size from the filesystem's point
of view with the allocation size from qcow2's point of view (by
refcounts). If the allocation in the filesystem is significantly lower,
consider it the metadata-preallocation case.

iotest 102 changed, as our detector can't detect a shrunk file as
metadata-preallocated. This doesn't seem to be wrong, as with metadata
preallocation we always have a valid file length.

The other two iotests have a slightly changed QMP output sequence,
which should be exactly because of the skipped lseek at the beginning
of mirror.

Suggested-by: Denis V. Lunev
Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 block/qcow2.h              |  4 ++++
 include/block/block.h      |  8 +++++++-
 block/io.c                 |  9 ++++++++-
 block/qcow2-refcount.c     | 32 ++++++++++++++++++++++++++++++++
 block/qcow2.c              | 11 +++++++++++
 tests/qemu-iotests/102     |  2 +-
 tests/qemu-iotests/102.out |  3 ++-
 tests/qemu-iotests/141.out |  2 +-
 tests/qemu-iotests/144.out |  2 +-
 9 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index fdee297f33..b6135d8271 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -350,6 +350,9 @@ typedef struct BDRVQcow2State {
     int nb_compress_threads;
 
     BdrvChild *data_file;
+
+    bool metadata_preallocation_checked;
+    bool metadata_preallocation;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
@@ -643,6 +646,7 @@ int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
                                 void *cb_opaque, Error **errp);
 int qcow2_shrink_reftable(BlockDriverState *bs);
 int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size);
+int qcow2_detect_metadata_preallocation(BlockDriverState *bs);
 
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
diff --git a/include/block/block.h b/include/block/block.h
index c7a26199aa..1f2f08e4ee 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -156,10 +156,15 @@ typedef struct HDGeometry {
  * BDRV_BLOCK_EOF: the returned pnum covers through end of file for this
  *                 layer, set by block layer
  *
- * Internal flag:
+ * Internal flags:
  * BDRV_BLOCK_RAW: for use by passthrough drivers, such as raw, to request
  *                 that the block layer recompute the answer from the returned
  *                 BDS; must be accompanied by just BDRV_BLOCK_OFFSET_VALID.
+ * BDRV_BLOCK_RECURSE: request that the block layer will recursively search for
+ *                     zeroes in file child of current block node inside
+ *                     returned region. Only valid together with both
+ *                     BDRV_BLOCK_DATA and BDRV_BLOCK_OFFSET_VALID. Should not
+ *                     appear with BDRV_BLOCK_ZERO.
  *
  * If BDRV_BLOCK_OFFSET_VALID is set, the map parameter represents the
  * host offset within the returned BDS that is allocated for the
@@ -184,6 +189,7 @@ typedef struct HDGeometry {
 #define BDRV_BLOCK_RAW          0x08
 #define BDRV_BLOCK_ALLOCATED    0x10
 #define BDRV_BLOCK_EOF          0x20
+#define BDRV_BLOCK_RECURSE      0x40
 #define BDRV_BLOCK_OFFSET_MASK  BDRV_SECTOR_MASK
 
 typedef QSIMPLEQ_HEAD(BlockReopenQueue, BlockReopenQueueEntry) BlockReopenQueue;
diff --git a/block/io.c b/block/io.c
index dfc153b8d8..8595d4b504 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2121,6 +2121,12 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
      */
     assert(*pnum && QEMU_IS_ALIGNED(*pnum, align) &&
            align > offset - aligned_offset);
+    if (ret & BDRV_BLOCK_RECURSE) {
+        assert(ret & BDRV_BLOCK_DATA);
+        assert(ret & BDRV_BLOCK_OFFSET_VALID);
+        assert(!(ret & BDRV_BLOCK_ZERO));
+    }
+
     *pnum -= offset - aligned_offset;
     if (*pnum > bytes) {
         *pnum = bytes;
@@ -2151,7 +2157,8 @@ static int coroutine_fn bdrv_co_block_status(BlockDriverState *bs,
         }
     }
 
-    if (want_zero && local_file && local_file != bs &&
+    if (want_zero && ret & BDRV_BLOCK_RECURSE &&
+        local_file && local_file != bs &&
         (ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
         (ret & BDRV_BLOCK_OFFSET_VALID)) {
         int64_t file_pnum;
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index e0fe322500..5786ba93e0 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -3422,3 +3422,35 @@ int64_t qcow2_get_last_cluster(BlockDriverState *bs, int64_t size)
                             "There are no references in the refcount table.");
     return -EIO;
 }
+
+int qcow2_detect_metadata_preallocation(BlockDriverState *bs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int64_t i, end_cluster, cluster_count = 0, threshold;
+    int64_t file_length, real_allocation, real_clusters;
+
+    file_length = bdrv_getlength(bs->file->bs);
+    if (file_length < 0) {
+        return file_length;
+    }
+
+    real_allocation = bdrv_get_allocated_file_size(bs->file->bs);
+    if (real_allocation < 0) {
+        return real_allocation;
+    }
+
+    real_clusters = real_allocation / s->cluster_size;
+    threshold = MAX(real_clusters * 10 / 9, real_clusters + 2);
+
+    end_cluster = size_to_clusters(s, file_length);
+    for (i = 0; i < end_cluster && cluster_count < threshold; i++) {
+        uint64_t refcount;
+        int ret = qcow2_get_refcount(bs, i, &refcount);
+        if (ret < 0) {
+            return ret;
+        }
+        cluster_count += !!refcount;
+    }
+
+    return cluster_count >= threshold;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index d507ee0686..e89e1c4fb1 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1900,6 +1900,12 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     unsigned int bytes;
     int status = 0;
 
+    if (!s->metadata_preallocation_checked) {
+        ret = qcow2_detect_metadata_preallocation(bs);
+        s->metadata_preallocation = (ret == 1);
+        s->metadata_preallocation_checked = true;
+    }
+
     bytes = MIN(INT_MAX, count);
     qemu_co_mutex_lock(&s->lock);
     ret = qcow2_get_cluster_offset(bs, offset, &bytes, &cluster_offset);
@@ -1922,6 +1928,11 @@ static int coroutine_fn qcow2_co_block_status(BlockDriverState *bs,
     } else if (ret != QCOW2_CLUSTER_UNALLOCATED) {
         status |= BDRV_BLOCK_DATA;
     }
+    if (s->metadata_preallocation && (status & BDRV_BLOCK_DATA) &&
+        (status & BDRV_BLOCK_OFFSET_VALID))
+    {
+        status |= BDRV_BLOCK_RECURSE;
+    }
     return status;
 }
 
diff --git a/tests/qemu-iotests/102 b/tests/qemu-iotests/102
index cedd2b25dc..125402db2b 100755
--- a/tests/qemu-iotests/102
+++ b/tests/qemu-iotests/102
@@ -56,7 +56,7 @@ $QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io
 $QEMU_IMG resize -f raw --shrink "$TEST_IMG" $((5 * 64 * 1024))
 
 $QEMU_IO -c map "$TEST_IMG"
-$QEMU_IMG map "$TEST_IMG"
+$QEMU_IMG map "$TEST_IMG" | _filter_qemu_img_map
 
 echo
 echo '=== Testing map on an image file truncated outside of qemu ==='
diff --git a/tests/qemu-iotests/102.out b/tests/qemu-iotests/102.out
index 4401b08fee..cd2fdc7f96 100644
--- a/tests/qemu-iotests/102.out
+++ b/tests/qemu-iotests/102.out
@@ -7,7 +7,8 @@ wrote 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 Image resized.
 64 KiB (0x10000) bytes allocated at offset 0 bytes (0x0)
-Offset Length Mapped to File
+Offset Length File
+0 0x10000 TEST_DIR/t.IMGFMT
 
 === Testing map on an image file truncated outside of qemu ===
 
diff --git a/tests/qemu-iotests/141.out b/tests/qemu-iotests/141.out
index 41c7291258..4d71d9dcae 100644
--- a/tests/qemu-iotests/141.out
+++ b/tests/qemu-iotests/141.out
@@ -42,9 +42,9 @@ Formatting 'TEST_DIR/o.IMGFMT', fmt=IMGFMT size=1048576 backing_file=TEST_DIR/t.
 {"return": {}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "job0"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "job0"}}
+{"return": {}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "job0"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "job0", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
-{"return": {}}
 {"error": {"class": "GenericError", "desc": "Node 'drv0' is busy: block device is in use by block job: commit"}}
 {"return": {}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "job0"}}
diff --git a/tests/qemu-iotests/144.out b/tests/qemu-iotests/144.out
index 55299201e4..a9a8216bea 100644
--- a/tests/qemu-iotests/144.out
+++ b/tests/qemu-iotests/144.out
@@ -14,10 +14,10 @@ Formatting 'TEST_DIR/tmp.qcow2', fmt=qcow2 size=536870912 backing_file=TEST_DIR/
 
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "created", "id": "virtio0"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "running", "id": "virtio0"}}
+{"return": {}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "ready", "id": "virtio0"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_READY", "data": {"device": "virtio0", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
 {"return": {}}
-{"return": {}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "waiting", "id": "virtio0"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "JOB_STATUS_CHANGE", "data": {"status": "pending", "id": "virtio0"}}
 {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "virtio0", "len": 0, "offset": 0, "speed": 0, "type": "commit"}}
--
2.18.0