From: Kevin Wolf
To: qemu-block@nongnu.org
Cc: kwolf@redhat.com, peter.maydell@linaro.org, qemu-devel@nongnu.org
Date: Thu, 12 Jul 2018 18:31:52 +0200
Message-Id: <20180712163152.12521-8-kwolf@redhat.com>
In-Reply-To: <20180712163152.12521-1-kwolf@redhat.com>
References: <20180712163152.12521-1-kwolf@redhat.com>
Subject: [Qemu-devel] [PULL 7/7] qemu-img: align result of is_allocated_sectors

From: Peter Lieven

We currently don't enforce that the sparse segments we detect during convert
are aligned. This leads to unnecessary and costly read-modify-write cycles
either internally in Qemu or in the background on the storage device as nearly
all modern filesystems or hardware have a 4k alignment internally.

This patch modifies is_allocated_sectors so that its *pnum result will always
end at an alignment boundary. This way all requests will end at an alignment
boundary. The start of all requests will also be aligned as long as the results
of get_block_status do not lead to an unaligned offset.

The number of RMW cycles when converting an example image [1] to a raw device
that has 4k sector size is about 4600 4k read requests to perform a total of
about 15000 write requests. With this patch the additional 4600 read requests
are eliminated while the number of total write requests stays constant.

[1] https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.vmdk

Signed-off-by: Peter Lieven
Signed-off-by: Kevin Wolf
---
 qemu-img.c                 | 44 ++++++++++++++++++++++++++++++++++++++------
 tests/qemu-iotests/122.out | 18 ++++++++----------
 2 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index f4074ebf75..4a7ce43dc9 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1105,11 +1105,15 @@ static int64_t find_nonzero(const uint8_t *buf, int64_t n)
  *
  * 'pnum' is set to the number of sectors (including and immediately following
  * the first one) that are known to be in the same allocated/unallocated state.
+ * The function will try to align the end offset to alignment boundaries so
+ * that the request will at least end aligned and consequtive requests will
+ * also start at an aligned offset.
  */
-static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
+static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum,
+                                int64_t sector_num, int alignment)
 {
     bool is_zero;
-    int i;
+    int i, tail;
 
     if (n <= 0) {
         *pnum = 0;
@@ -1122,6 +1126,23 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
             break;
         }
     }
+
+    tail = (sector_num + i) & (alignment - 1);
+    if (tail) {
+        if (is_zero && i <= tail) {
+            /* treat unallocated areas which only consist
+             * of a small tail as allocated. */
+            is_zero = false;
+        }
+        if (!is_zero) {
+            /* align up end offset of allocated areas. */
+            i += alignment - tail;
+            i = MIN(i, n);
+        } else {
+            /* align down end offset of zero areas. */
+            i -= tail;
+        }
+    }
     *pnum = i;
     return !is_zero;
 }
@@ -1132,7 +1153,7 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
  * breaking up write requests for only small sparse areas.
  */
 static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
-    int min)
+    int min, int64_t sector_num, int alignment)
 {
     int ret;
     int num_checked, num_used;
@@ -1141,7 +1162,7 @@ static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
         min = n;
     }
 
-    ret = is_allocated_sectors(buf, n, pnum);
+    ret = is_allocated_sectors(buf, n, pnum, sector_num, alignment);
     if (!ret) {
         return ret;
     }
@@ -1149,13 +1170,15 @@ static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
     num_used = *pnum;
     buf += BDRV_SECTOR_SIZE * *pnum;
     n -= *pnum;
+    sector_num += *pnum;
     num_checked = num_used;
 
     while (n > 0) {
-        ret = is_allocated_sectors(buf, n, pnum);
+        ret = is_allocated_sectors(buf, n, pnum, sector_num, alignment);
 
         buf += BDRV_SECTOR_SIZE * *pnum;
         n -= *pnum;
+        sector_num += *pnum;
         num_checked += *pnum;
         if (ret) {
             num_used = num_checked;
@@ -1560,6 +1583,7 @@ typedef struct ImgConvertState {
     bool wr_in_order;
     bool copy_range;
     int min_sparse;
+    int alignment;
     size_t cluster_sectors;
     size_t buf_sectors;
     long num_coroutines;
@@ -1724,7 +1748,8 @@ static int coroutine_fn convert_co_write(ImgConvertState *s, int64_t sector_num,
              * zeroed. */
             if (!s->min_sparse ||
                 (!s->compressed &&
-                 is_allocated_sectors_min(buf, n, &n, s->min_sparse)) ||
+                 is_allocated_sectors_min(buf, n, &n, s->min_sparse,
+                                          sector_num, s->alignment)) ||
                 (s->compressed &&
                  !buffer_is_zero(buf, n * BDRV_SECTOR_SIZE)))
             {
@@ -2368,6 +2393,13 @@ static int img_convert(int argc, char **argv)
                                 out_bs->bl.pdiscard_alignment >>
                                 BDRV_SECTOR_BITS)));
 
+    /* try to align the write requests to the destination to avoid unnecessary
+     * RMW cycles. */
+    s.alignment = MAX(pow2floor(s.min_sparse),
+                      DIV_ROUND_UP(out_bs->bl.request_alignment,
+                                   BDRV_SECTOR_SIZE));
+    assert(is_power_of_2(s.alignment));
+
     if (skip_create) {
         int64_t output_sectors = blk_nb_sectors(s.target);
         if (output_sectors < 0) {
diff --git a/tests/qemu-iotests/122.out b/tests/qemu-iotests/122.out
index 6c7ee1da6c..c576705284 100644
--- a/tests/qemu-iotests/122.out
+++ b/tests/qemu-iotests/122.out
@@ -194,12 +194,12 @@ wrote 1024/1024 bytes at offset 17408
 1 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
 convert -S 4k
-[{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 1024, "length": 7168, "depth": 0, "zero": true, "data": false},
-{ "start": 8192, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 9216, "length": 8192, "depth": 0, "zero": true, "data": false},
-{ "start": 17408, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 18432, "length": 67090432, "depth": 0, "zero": true, "data": false}]
+[{ "start": 0, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 4096, "length": 4096, "depth": 0, "zero": true, "data": false},
+{ "start": 8192, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 12288, "length": 4096, "depth": 0, "zero": true, "data": false},
+{ "start": 16384, "length": 4096, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 20480, "length": 67088384, "depth": 0, "zero": true, "data": false}]
 
 convert -c -S 4k
 [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true},
@@ -210,10 +210,8 @@ convert -c -S 4k
 { "start": 18432, "length": 67090432, "depth": 0, "zero": true, "data": false}]
 
 convert -S 8k
-[{ "start": 0, "length": 9216, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 9216, "length": 8192, "depth": 0, "zero": true, "data": false},
-{ "start": 17408, "length": 1024, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
-{ "start": 18432, "length": 67090432, "depth": 0, "zero": true, "data": false}]
+[{ "start": 0, "length": 24576, "depth": 0, "zero": false, "data": true, "offset": OFFSET},
+{ "start": 24576, "length": 67084288, "depth": 0, "zero": true, "data": false}]
 
 convert -c -S 8k
 [{ "start": 0, "length": 1024, "depth": 0, "zero": false, "data": true},
-- 
2.13.6
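
Illustration only, not part of the patch: the tail handling added to
is_allocated_sectors and the s.alignment computation in img_convert can be
tried in isolation with the sketch below. The helper names align_run_end,
pow2floor_u32 and convert_alignment are hypothetical and do not exist in
qemu-img.c; SECTOR_SIZE stands in for BDRV_SECTOR_SIZE and is assumed to be
512 bytes. The real code works on the buffer contents, while this sketch
takes the length of the detected run as an input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u /* stand-in for BDRV_SECTOR_SIZE (assumed 512) */

/* Hypothetical helper mirroring the new tail logic: a run of `i` sectors
 * starts at `sector_num` inside a buffer of `n` sectors. The returned
 * length ends on a multiple of `alignment` (sectors, power of two) unless
 * the buffer itself ends earlier. `*is_zero` may flip to false when the
 * zero run is no longer than the unaligned tail. */
static int align_run_end(int64_t sector_num, int i, int n, int alignment,
                         bool *is_zero)
{
    assert(alignment > 0 && (alignment & (alignment - 1)) == 0);

    int tail = (int)((sector_num + i) & (alignment - 1));
    if (tail) {
        if (*is_zero && i <= tail) {
            /* a zero run that is only a small tail counts as allocated */
            *is_zero = false;
        }
        if (!*is_zero) {
            /* allocated run: round the end up, clamped to the buffer */
            i += alignment - tail;
            if (i > n) {
                i = n;
            }
        } else {
            /* zero run: round the end down */
            i -= tail;
        }
    }
    return i;
}

/* Largest power of two <= v, or 0 for v == 0 (local stand-in helper). */
static unsigned pow2floor_u32(unsigned v)
{
    unsigned p = 1;
    if (v == 0) {
        return 0;
    }
    while (p <= v / 2) {
        p <<= 1;
    }
    return p;
}

/* Hypothetical wrapper for the s.alignment expression: the larger of the
 * rounded-down min_sparse (sectors) and the request alignment (bytes)
 * converted to sectors, rounding up. */
static unsigned convert_alignment(unsigned min_sparse_sectors,
                                  unsigned request_alignment_bytes)
{
    unsigned a = pow2floor_u32(min_sparse_sectors);
    unsigned b = (request_alignment_bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
    return a > b ? a : b;
}
```

With -S 4k (8 sectors) this reproduces the first extent of the updated
122.out: a 1024-byte data run at offset 0 is two sectors long, so tail is 2
and the run is extended to 8 sectors, i.e. 4096 bytes.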