From: Peter Lieven
To: qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, Peter Lieven, mreitz@redhat.com
Date: Thu, 5 Jul 2018 12:52:30 +0200
Message-Id: <1530787950-25306-1-git-send-email-pl@kamp.de>
Subject: [Qemu-devel] [PATCH V3] qemu-img: align result of is_allocated_sectors

We currently don't enforce that the sparse segments we detect during convert
are aligned. This leads to unnecessary and costly read-modify-write cycles,
either internally in QEMU or in the background on the storage device, as
nearly all modern filesystems and hardware have a 4k alignment internally.
The number of RMW cycles when converting an example image [1] to a raw
device with a 4k sector size is about 4600 4k read requests to perform a
total of about 15000 write requests. With this patch the additional 4600
read requests are eliminated.

[1] https://cloud-images.ubuntu.com/releases/16.04/release/ubuntu-16.04-server-cloudimg-amd64-disk1.vmdk

Signed-off-by: Peter Lieven
---
V2->V3: - ensure that s.alignment is a power of 2
        - correctly handle n < alignment in is_allocated_sectors
          if sector_num % alignment > 0.
V1->V2: - take the current sector offset into account [Max]
        - try to figure out the target alignment [Max]

 qemu-img.c | 46 ++++++++++++++++++++++++++++++++++++----------
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/qemu-img.c b/qemu-img.c
index e1a506f..db91b9e 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1105,8 +1105,11 @@ static int64_t find_nonzero(const uint8_t *buf, int64_t n)
  *
  * 'pnum' is set to the number of sectors (including and immediately following
  * the first one) that are known to be in the same allocated/unallocated state.
+ * The function will try to align 'pnum' to the number of sectors specified
+ * in 'alignment' to avoid unnecessary RMW cycles on modern hardware.
  */
-static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
+static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum,
+                                int64_t sector_num, int alignment)
 {
     bool is_zero;
     int i;
@@ -1115,14 +1118,26 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
         *pnum = 0;
         return 0;
     }
-    is_zero = buffer_is_zero(buf, 512);
-    for(i = 1; i < n; i++) {
-        buf += 512;
-        if (is_zero != buffer_is_zero(buf, 512)) {
+
+    if (n % alignment) {
+        alignment = 1;
+    }
+
+    if (sector_num % alignment) {
+        n = ROUND_UP(sector_num, alignment) - sector_num;
+        alignment = 1;
+    }
+
+    n /= alignment;
+
+    is_zero = buffer_is_zero(buf, BDRV_SECTOR_SIZE * alignment);
+    for (i = 1; i < n; i++) {
+        buf += BDRV_SECTOR_SIZE * alignment;
+        if (is_zero != buffer_is_zero(buf, BDRV_SECTOR_SIZE * alignment)) {
             break;
         }
     }
-    *pnum = i;
+    *pnum = i * alignment;
     return !is_zero;
 }
 
@@ -1132,7 +1147,7 @@ static int is_allocated_sectors(const uint8_t *buf, int n, int *pnum)
  * breaking up write requests for only small sparse areas.
  */
 static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
-                                    int min)
+                                    int min, int64_t sector_num, int alignment)
 {
     int ret;
     int num_checked, num_used;
@@ -1141,7 +1156,7 @@ static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
         min = n;
     }
 
-    ret = is_allocated_sectors(buf, n, pnum);
+    ret = is_allocated_sectors(buf, n, pnum, sector_num, alignment);
     if (!ret) {
         return ret;
     }
@@ -1149,13 +1164,15 @@ static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
     num_used = *pnum;
     buf += BDRV_SECTOR_SIZE * *pnum;
     n -= *pnum;
+    sector_num += *pnum;
     num_checked = num_used;
 
     while (n > 0) {
-        ret = is_allocated_sectors(buf, n, pnum);
+        ret = is_allocated_sectors(buf, n, pnum, sector_num, alignment);
 
         buf += BDRV_SECTOR_SIZE * *pnum;
         n -= *pnum;
+        sector_num += *pnum;
         num_checked += *pnum;
         if (ret) {
             num_used = num_checked;
@@ -1560,6 +1577,7 @@ typedef struct ImgConvertState {
     bool wr_in_order;
     bool copy_range;
     int min_sparse;
+    int alignment;
     size_t cluster_sectors;
     size_t buf_sectors;
     long num_coroutines;
@@ -1724,7 +1742,8 @@ static int coroutine_fn convert_co_write(ImgConvertState *s, int64_t sector_num,
      * zeroed. */
     if (!s->min_sparse ||
         (!s->compressed &&
-         is_allocated_sectors_min(buf, n, &n, s->min_sparse)) ||
+         is_allocated_sectors_min(buf, n, &n, s->min_sparse,
+                                  sector_num, s->alignment)) ||
         (s->compressed &&
          !buffer_is_zero(buf, n * BDRV_SECTOR_SIZE)))
     {
@@ -2373,6 +2392,13 @@ static int img_convert(int argc, char **argv)
                                          out_bs->bl.pdiscard_alignment >>
                                          BDRV_SECTOR_BITS)));
 
+    /* try to align the write requests to the destination to avoid unnecessary
+     * RMW cycles. */
+    s.alignment = MAX(pow2floor(s.min_sparse),
+                      DIV_ROUND_UP(out_bs->bl.request_alignment,
+                                   BDRV_SECTOR_SIZE));
+    assert(is_power_of_2(s.alignment));
+
     if (skip_create) {
         int64_t output_sectors = blk_nb_sectors(s.target);
         if (output_sectors < 0) {
-- 
2.7.4
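[Editor's note] For readers who want to experiment with the new alignment
behaviour outside of QEMU, here is a minimal Python model (hypothetical,
not part of the patch) of the patched is_allocated_sectors() logic.
buffer_is_zero() is approximated by a plain all-zero scan and ROUND_UP by
integer arithmetic; the function names mirror the C code for readability:

```python
# Hypothetical Python model of the patched is_allocated_sectors().
BDRV_SECTOR_SIZE = 512

def round_up(x, a):
    # ROUND_UP(x, a): round x up to the next multiple of a.
    return -(-x // a) * a

def is_allocated_sectors(buf, n, sector_num, alignment):
    """Return (allocated, pnum): whether the first chunk of 'buf' is
    allocated (non-zero) and how many sectors share that state, with
    'pnum' aligned to 'alignment' where possible."""
    if n <= 0:
        return 0, 0
    # A tail shorter than the alignment is scanned sector by sector.
    if n % alignment:
        alignment = 1
    # An unaligned start is scanned sector by sector, and only up to
    # the next alignment boundary.
    if sector_num % alignment:
        n = round_up(sector_num, alignment) - sector_num
        alignment = 1
    n //= alignment
    step = BDRV_SECTOR_SIZE * alignment
    is_zero = not any(buf[:step])          # buffer_is_zero() stand-in
    i = 1
    while i < n:
        if (not any(buf[i * step:(i + 1) * step])) != is_zero:
            break
        i += 1
    return (0 if is_zero else 1), i * alignment
```

With alignment=8, an all-zero 16-sector buffer at an aligned offset is
reported as a single 16-sector hole, while a buffer starting at the
unaligned offset sector_num=3 is only scanned up to the next 8-sector
boundary, so pnum never exceeds 5 there.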