From: Vladimir Sementsov-Ogievskiy
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, vsementsov@virtuozzo.com, armbru@redhat.com, qemu-devel@nongnu.org, mreitz@redhat.com, stefanha@redhat.com, den@openvz.org
Date: Fri, 16 Aug 2019 18:30:15 +0300
Message-Id: <20190816153015.447957-6-vsementsov@virtuozzo.com>
In-Reply-To: <20190816153015.447957-1-vsementsov@virtuozzo.com>
References: <20190816153015.447957-1-vsementsov@virtuozzo.com>
Subject: [Qemu-devel] [PATCH v4 5/5] block/qcow2: introduce parallel subrequest handling in read and write

Handling the sub-requests of a single qcow2 request in parallel improves
performance for fragmented qcow2 images.

It also affects iotest 026 by increasing the number of leaked clusters,
which is not surprising when several sub-requests of one qcow2 request
run in parallel.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 block/qcow2.h                      |   3 +
 block/qcow2.c                      | 125 ++++++++++++++++++++++++++---
 block/trace-events                 |   1 +
 tests/qemu-iotests/026.out         |  18 +++--
 tests/qemu-iotests/026.out.nocache |  20 ++---
 5 files changed, 138 insertions(+), 29 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 998bcdaef1..fdfa9c31cd 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -65,6 +65,9 @@
 #define QCOW2_MAX_BITMAPS 65535
 #define QCOW2_MAX_BITMAP_DIRECTORY_SIZE (1024 * QCOW2_MAX_BITMAPS)
 
+/* Maximum of parallel sub-request per guest request */
+#define QCOW2_MAX_WORKERS 8
+
 /* indicate that the refcount of the referenced cluster is exactly one. */
 #define QCOW_OFLAG_COPIED (1ULL << 63)
 /* indicate that the cluster is compressed (they never have the copied flag) */
diff --git a/block/qcow2.c b/block/qcow2.c
index 3aaa180e2b..36b41e8536 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -40,6 +40,7 @@
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-block-core.h"
 #include "crypto.h"
+#include "block/aio_task.h"
 
 /*
   Differences with QCOW:
@@ -2017,6 +2018,60 @@ fail:
     return ret;
 }
 
+typedef struct Qcow2AioTask {
+    AioTask task;
+
+    BlockDriverState *bs;
+    QCow2ClusterType cluster_type; /* only for read */
+    uint64_t file_cluster_offset;
+    uint64_t offset;
+    uint64_t bytes;
+    QEMUIOVector *qiov;
+    uint64_t qiov_offset;
+    QCowL2Meta *l2meta; /* only for write */
+} Qcow2AioTask;
+
+static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task);
+static coroutine_fn int qcow2_add_task(BlockDriverState *bs,
+                                       AioTaskPool *pool,
+                                       AioTaskFunc func,
+                                       QCow2ClusterType cluster_type,
+                                       uint64_t file_cluster_offset,
+                                       uint64_t offset,
+                                       uint64_t bytes,
+                                       QEMUIOVector *qiov,
+                                       size_t qiov_offset,
+                                       QCowL2Meta *l2meta)
+{
+    Qcow2AioTask local_task;
+    Qcow2AioTask *task = pool ? g_new(Qcow2AioTask, 1) : &local_task;
+
+    *task = (Qcow2AioTask) {
+        .task.func = func,
+        .bs = bs,
+        .cluster_type = cluster_type,
+        .qiov = qiov,
+        .file_cluster_offset = file_cluster_offset,
+        .offset = offset,
+        .bytes = bytes,
+        .qiov_offset = qiov_offset,
+        .l2meta = l2meta,
+    };
+
+    trace_qcow2_add_task(qemu_coroutine_self(), bs, pool,
+                         func == qcow2_co_preadv_task_entry ? "read" : "write",
+                         cluster_type, file_cluster_offset, offset, bytes,
+                         qiov, qiov_offset);
+
+    if (!pool) {
+        return func(&task->task);
+    }
+
+    aio_task_pool_start_task(pool, &task->task);
+
+    return 0;
+}
+
 static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
                                              QCow2ClusterType cluster_type,
                                              uint64_t file_cluster_offset,
@@ -2066,18 +2121,28 @@ static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs,
     g_assert_not_reached();
 }
 
+static coroutine_fn int qcow2_co_preadv_task_entry(AioTask *task)
+{
+    Qcow2AioTask *t = container_of(task, Qcow2AioTask, task);
+
+    assert(!t->l2meta);
+
+    return qcow2_co_preadv_task(t->bs, t->cluster_type, t->file_cluster_offset,
+                                t->offset, t->bytes, t->qiov, t->qiov_offset);
+}
+
 static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
                                              uint64_t offset, uint64_t bytes,
                                              QEMUIOVector *qiov,
                                              size_t qiov_offset, int flags)
 {
     BDRVQcow2State *s = bs->opaque;
-    int ret;
+    int ret = 0;
     unsigned int cur_bytes; /* number of bytes in current iteration */
     uint64_t cluster_offset = 0;
+    AioTaskPool *aio = NULL;
 
-    while (bytes != 0) {
-
+    while (bytes != 0 && aio_task_pool_status(aio) == 0) {
         /* prepare next request */
         cur_bytes = MIN(bytes, INT_MAX);
         if (s->crypto) {
@@ -2089,7 +2154,7 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
         qemu_co_mutex_unlock(&s->lock);
         if (ret < 0) {
-            return ret;
+            goto out;
         }
 
         if (ret == QCOW2_CLUSTER_ZERO_PLAIN ||
@@ -2098,11 +2163,14 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         {
             qemu_iovec_memset(qiov, qiov_offset, 0, cur_bytes);
         } else {
-            ret = qcow2_co_preadv_task(bs, ret,
-                                       cluster_offset, offset, cur_bytes,
-                                       qiov, qiov_offset);
+            if (!aio && cur_bytes != bytes) {
+                aio = aio_task_pool_new(QCOW2_MAX_WORKERS);
+            }
+            ret = qcow2_add_task(bs, aio, qcow2_co_preadv_task_entry, ret,
+                                 cluster_offset, offset, cur_bytes,
+                                 qiov, qiov_offset, NULL);
             if (ret < 0) {
-                return ret;
+                goto out;
             }
         }
 
@@ -2111,7 +2179,16 @@ static coroutine_fn int qcow2_co_preadv_part(BlockDriverState *bs,
         qiov_offset += cur_bytes;
     }
 
-    return 0;
+out:
+    if (aio) {
+        aio_task_pool_wait_all(aio);
+        if (ret == 0) {
+            ret = aio_task_pool_status(aio);
+        }
+        g_free(aio);
+    }
+
+    return ret;
 }
 
 /* Check if it's possible to merge a write request with the writing of
@@ -2315,6 +2392,17 @@ out_locked:
     return ret;
 }
 
+static coroutine_fn int qcow2_co_pwritev_task_entry(AioTask *task)
+{
+    Qcow2AioTask *t = container_of(task, Qcow2AioTask, task);
+
+    assert(!t->cluster_type);
+
+    return qcow2_co_pwritev_task(t->bs, t->file_cluster_offset,
+                                 t->offset, t->bytes, t->qiov, t->qiov_offset,
+                                 t->l2meta);
+}
+
 static coroutine_fn int qcow2_co_pwritev_part(
         BlockDriverState *bs, uint64_t offset, uint64_t bytes,
         QEMUIOVector *qiov, size_t qiov_offset, int flags)
@@ -2325,10 +2413,11 @@ static coroutine_fn int qcow2_co_pwritev_part(
     unsigned int cur_bytes; /* number of sectors in current iteration */
     uint64_t cluster_offset;
     QCowL2Meta *l2meta = NULL;
+    AioTaskPool *aio = NULL;
 
     trace_qcow2_writev_start_req(qemu_coroutine_self(), offset, bytes);
 
-    while (bytes != 0) {
+    while (bytes != 0 && aio_task_pool_status(aio) == 0) {
 
         l2meta = NULL;
 
@@ -2360,8 +2449,12 @@ static coroutine_fn int qcow2_co_pwritev_part(
 
         qemu_co_mutex_unlock(&s->lock);
 
-        ret = qcow2_co_pwritev_task(bs, cluster_offset, offset, cur_bytes,
-                                    qiov, qiov_offset, l2meta);
+        if (!aio && cur_bytes != bytes) {
+            aio = aio_task_pool_new(QCOW2_MAX_WORKERS);
+        }
+        ret = qcow2_add_task(bs, aio, qcow2_co_pwritev_task_entry, 0,
+                             cluster_offset, offset, cur_bytes,
+                             qiov, qiov_offset, l2meta);
         l2meta = NULL; /* l2meta is consumed by qcow2_co_pwritev_task() */
         if (ret < 0) {
             goto fail_nometa;
@@ -2382,6 +2475,14 @@ out_locked:
     qemu_co_mutex_unlock(&s->lock);
 
 fail_nometa:
+    if (aio) {
+        aio_task_pool_wait_all(aio);
+        if (ret == 0) {
+            ret = aio_task_pool_status(aio);
+        }
+        g_free(aio);
+    }
+
     trace_qcow2_writev_done_req(qemu_coroutine_self(), ret);
 
     return ret;
diff --git a/block/trace-events b/block/trace-events
index d724df0117..7f51550ba3 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -61,6 +61,7 @@ file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "
 file_copy_file_range(void *bs, int src, int64_t src_off, int dst, int64_t dst_off, int64_t bytes, int flags, int64_t ret) "bs %p src_fd %d offset %"PRIu64" dst_fd %d offset %"PRIu64" bytes %"PRIu64" flags %d ret %"PRId64
 
 # qcow2.c
+qcow2_add_task(void *co, void *bs, void *pool, const char *action, int cluster_type, uint64_t file_cluster_offset, uint64_t offset, uint64_t bytes, void *qiov, size_t qiov_offset) "co %p bs %p pool %p: %s: cluster_type %d file_cluster_offset %" PRIu64 " offset %" PRIu64 " bytes %" PRIu64 " qiov %p qiov_offset %zu"
 qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset 0x%" PRIx64 " bytes %d"
 qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
 qcow2_writev_start_part(void *co) "co %p"
diff --git a/tests/qemu-iotests/026.out b/tests/qemu-iotests/026.out
index fb89b8480c..4849c9c90a 100644
--- a/tests/qemu-iotests/026.out
+++ b/tests/qemu-iotests/026.out
@@ -481,7 +481,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-55 leaked clusters were found on the image.
+119 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -508,7 +508,9 @@ Event: refblock_alloc_write; errno: 28; imm: off; once: off; write
 qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
-No errors were found on the image.
+
+64 leaked clusters were found on the image.
+This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
 Event: refblock_alloc_write; errno: 28; imm: off; once: off; write -b
@@ -533,7 +535,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-10 leaked clusters were found on the image.
+74 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -542,7 +544,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-23 leaked clusters were found on the image.
+87 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -561,7 +563,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-10 leaked clusters were found on the image.
+74 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -570,7 +572,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-23 leaked clusters were found on the image.
+87 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -589,7 +591,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-10 leaked clusters were found on the image.
+74 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
@@ -598,7 +600,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-23 leaked clusters were found on the image.
+87 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 
 === L1 growth tests ===
diff --git a/tests/qemu-iotests/026.out.nocache b/tests/qemu-iotests/026.out.nocache
index 6dda95dfb4..6b56df7788 100644
--- a/tests/qemu-iotests/026.out.nocache
+++ b/tests/qemu-iotests/026.out.nocache
@@ -489,7 +489,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-55 leaked clusters were found on the image.
+119 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
 
@@ -516,8 +516,10 @@ Event: refblock_alloc_write; errno: 28; imm: off; once: off; write
 qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
-No errors were found on the image.
-Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
+
+64 leaked clusters were found on the image.
+This means waste of disk space, but no harm to data.
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824
 
 Event: refblock_alloc_write; errno: 28; imm: off; once: off; write -b
 qemu-io: Failed to flush the L2 table cache: No space left on device
@@ -541,7 +543,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-10 leaked clusters were found on the image.
+74 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
 
@@ -550,7 +552,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-23 leaked clusters were found on the image.
+87 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
 
@@ -569,7 +571,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-10 leaked clusters were found on the image.
+74 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
 
@@ -578,7 +580,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-23 leaked clusters were found on the image.
+87 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
 
@@ -597,7 +599,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-10 leaked clusters were found on the image.
+74 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=1073741824 
 
@@ -606,7 +608,7 @@ qemu-io: Failed to flush the L2 table cache: No space left on device
 qemu-io: Failed to flush the refcount block cache: No space left on device
 write failed: No space left on device
 
-23 leaked clusters were found on the image.
+87 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
 
 === L1 growth tests ===
-- 
2.18.0