From: Ashijeet Acharya
To: famz@redhat.com
Cc: kwolf@redhat.com, qemu-block@nongnu.org, stefanha@gmail.com, qemu-devel@nongnu.org, mreitz@redhat.com, Ashijeet Acharya, jsnow@redhat.com
Date: Sat, 1 Apr 2017 20:14:35 +0530
Subject: [Qemu-devel] [PATCH v3 3/6] vmdk: New functions to assist allocating multiple clusters
Message-Id: <1491057878-27868-4-git-send-email-ashijeetacharya@gmail.com>
In-Reply-To: <1491057878-27868-1-git-send-email-ashijeetacharya@gmail.com>
References: <1491057878-27868-1-git-send-email-ashijeetacharya@gmail.com>

Move the cluster table loading code out of the existing
get_cluster_offset() function and implement it in separate
get_cluster_table() and vmdk_l2load() functions to avoid code
duplication.

Introduce two new helper functions, handle_alloc() and
vmdk_alloc_cluster_offset(). handle_alloc() allocates multiple clusters
at once, starting from a given offset on disk, and performs COW if
necessary for the first and last allocated clusters.
vmdk_alloc_cluster_offset() returns the offset of the first of the
newly allocated clusters. Also, provide proper documentation for both.

Signed-off-by: Ashijeet Acharya
---
 block/vmdk.c | 337 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 308 insertions(+), 29 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 73ae786..e5a289d 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -136,6 +136,7 @@ typedef struct VmdkMetaData {
     unsigned int l2_offset;
     int valid;
     uint32_t *l2_cache_entry;
+    uint32_t nb_clusters;
 } VmdkMetaData;
 
 typedef struct VmdkGrainMarker {
@@ -254,6 +255,14 @@ static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
     return extent_relative_offset % cluster_size;
 }
 
+static inline uint64_t size_to_clusters(VmdkExtent *extent, uint64_t size)
+{
+    uint64_t cluster_size, round_off_size;
+    cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+    round_off_size = cluster_size - (size % cluster_size);
+    return DIV_ROUND_UP(size + round_off_size, BDRV_SECTOR_SIZE * 128) - 1;
+}
+
 static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 {
     char *desc;
@@ -1028,6 +1037,133 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
     }
 }
 
+static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
+                         uint32_t offset)
+{
+    offset = cpu_to_le32(offset);
+    /* update L2 table */
+    if (bdrv_pwrite_sync(extent->file,
+                ((int64_t)m_data->l2_offset * 512)
+                    + (m_data->l2_index * sizeof(offset)),
+                &offset, sizeof(offset)) < 0) {
+        return VMDK_ERROR;
+    }
+    /* update backup L2 table */
+    if (extent->l1_backup_table_offset != 0) {
+        m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
+        if (bdrv_pwrite_sync(extent->file,
+                    ((int64_t)m_data->l2_offset * 512)
+                        + (m_data->l2_index * sizeof(offset)),
+                    &offset, sizeof(offset)) < 0) {
+            return VMDK_ERROR;
+        }
+    }
+    if (m_data->l2_cache_entry) {
+        *m_data->l2_cache_entry = offset;
+    }
+
+    return VMDK_OK;
+}
+
+/*
+ * vmdk_l2load
+ *
+ * Loads a new L2 table into memory. If the table is in the cache, the cache
+ * is used; otherwise the L2 table is loaded from the image file.
+ *
+ * Returns:
+ *   VMDK_OK:       on success
+ *   VMDK_ERROR:    in error cases
+ */
+static int vmdk_l2load(VmdkExtent *extent, uint64_t offset, int l2_offset,
+                       uint32_t **new_l2_table, int *new_l2_index)
+{
+    int min_index, i, j;
+    uint32_t *l2_table;
+    uint32_t min_count;
+
+    for (i = 0; i < L2_CACHE_SIZE; i++) {
+        if (l2_offset == extent->l2_cache_offsets[i]) {
+            /* increment the hit count */
+            if (++extent->l2_cache_counts[i] == UINT32_MAX) {
+                for (j = 0; j < L2_CACHE_SIZE; j++) {
+                    extent->l2_cache_counts[j] >>= 1;
+                }
+            }
+            l2_table = extent->l2_cache + (i * extent->l2_size);
+            goto found;
+        }
+    }
+    /* not found: load a new entry in the least used one */
+    min_index = 0;
+    min_count = UINT32_MAX;
+    for (i = 0; i < L2_CACHE_SIZE; i++) {
+        if (extent->l2_cache_counts[i] < min_count) {
+            min_count = extent->l2_cache_counts[i];
+            min_index = i;
+        }
+    }
+    l2_table = extent->l2_cache + (min_index * extent->l2_size);
+    if (bdrv_pread(extent->file,
+                (int64_t)l2_offset * 512,
+                l2_table,
+                extent->l2_size * sizeof(uint32_t)
+            ) != extent->l2_size * sizeof(uint32_t)) {
+        return VMDK_ERROR;
+    }
+
+    extent->l2_cache_offsets[min_index] = l2_offset;
+    extent->l2_cache_counts[min_index] = 1;
+found:
+    *new_l2_index = ((offset >> 9) / extent->cluster_sectors) % extent->l2_size;
+    *new_l2_table = l2_table;
+
+    return VMDK_OK;
+}
+
+/*
+ * get_cluster_table
+ *
+ * For a given offset, load (and allocate if needed) the L2 table.
+ *
+ * Returns:
+ *   VMDK_OK:        on success
+ *
+ *   VMDK_UNALLOC:   if cluster is not mapped
+ *
+ *   VMDK_ERROR:     in error cases
+ */
+static int get_cluster_table(VmdkExtent *extent, uint64_t offset,
+                             int *new_l1_index, int *new_l2_offset,
+                             int *new_l2_index, uint32_t **new_l2_table)
+{
+    int l1_index, l2_offset, l2_index;
+    uint32_t *l2_table;
+    int ret;
+
+    offset -= (extent->end_sector - extent->sectors) * SECTOR_SIZE;
+    l1_index = (offset >> 9) / extent->l1_entry_sectors;
+    if (l1_index >= extent->l1_size) {
+        return VMDK_ERROR;
+    }
+    l2_offset = extent->l1_table[l1_index];
+    if (!l2_offset) {
+        return VMDK_UNALLOC;
+    }
+
+    ret = vmdk_l2load(extent, offset, l2_offset, &l2_table, &l2_index);
+    if (ret < 0) {
+        return ret;
+    }
+
+    *new_l1_index = l1_index;
+    *new_l2_offset = l2_offset;
+    *new_l2_index = l2_index;
+    *new_l2_table = l2_table;
+
+    return VMDK_OK;
+}
+
 /*
  * vmdk_perform_cow
  *
@@ -1115,29 +1251,168 @@ exit:
     return ret;
 }
 
-static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data,
-                         uint32_t offset)
+/*
+ * handle_alloc
+ *
+ * Allocates new clusters for an area that either is yet unallocated or needs a
+ * copy on write. If *cluster_offset is non-zero, clusters are only allocated if
+ * the new allocation can match the specified host offset.
+ *
+ * Returns:
+ *   VMDK_OK:       if new clusters were allocated, *bytes may be decreased if
+ *                  the new allocation doesn't cover all of the requested area.
+ *                  *cluster_offset is updated to contain the offset of the
+ *                  first newly allocated cluster.
+ *   VMDK_UNALLOC:  if no clusters could be allocated. *cluster_offset is left
+ *                  unchanged.
+ *
+ *   VMDK_ERROR:    in error cases
+ */
+static int handle_alloc(BlockDriverState *bs, VmdkExtent *extent,
+                        uint64_t offset, uint64_t *cluster_offset,
+                        int64_t *bytes, VmdkMetaData *m_data,
+                        bool allocate, uint32_t *total_alloc_clusters)
 {
-    offset = cpu_to_le32(offset);
-    /* update L2 table */
-    if (bdrv_pwrite_sync(extent->file,
-                ((int64_t)m_data->l2_offset * 512)
-                    + (m_data->l2_index * sizeof(offset)),
-                &offset, sizeof(offset)) < 0) {
-        return VMDK_ERROR;
+    int l1_index, l2_offset, l2_index;
+    uint32_t *l2_table;
+    uint32_t cluster_sector;
+    uint32_t nb_clusters;
+    bool zeroed = false;
+    uint64_t skip_start_bytes, skip_end_bytes;
+    int ret;
+
+    ret = get_cluster_table(extent, offset, &l1_index, &l2_offset,
+                            &l2_index, &l2_table);
+    if (ret < 0) {
+        return ret;
     }
-    /* update backup L2 table */
-    if (extent->l1_backup_table_offset != 0) {
-        m_data->l2_offset = extent->l1_backup_table[m_data->l1_index];
-        if (bdrv_pwrite_sync(extent->file,
-                    ((int64_t)m_data->l2_offset * 512)
-                        + (m_data->l2_index * sizeof(offset)),
-                    &offset, sizeof(offset)) < 0) {
-            return VMDK_ERROR;
+
+    cluster_sector = le32_to_cpu(l2_table[l2_index]);
+
+    skip_start_bytes = vmdk_find_offset_in_cluster(extent, offset);
+    /* Calculate the number of clusters to look for. Here it will return one
+     * cluster less than the actual value calculated as we may need to perform
+     * COW for the last one. */
+    nb_clusters = size_to_clusters(extent, skip_start_bytes + *bytes);
+
+    nb_clusters = MIN(nb_clusters, extent->l2_size - l2_index);
+    assert(nb_clusters <= INT_MAX);
+
+    /* update bytes according to final nb_clusters value */
+    if (nb_clusters != 0) {
+        *bytes = ((nb_clusters * extent->cluster_sectors) << 9)
+                            - skip_start_bytes;
+    } else {
+        nb_clusters = 1;
+    }
+    *total_alloc_clusters += nb_clusters;
+    skip_end_bytes = skip_start_bytes + MIN(*bytes,
+                     extent->cluster_sectors * BDRV_SECTOR_SIZE
+                                    - skip_start_bytes);
+
+    if (extent->has_zero_grain && cluster_sector == VMDK_GTE_ZEROED) {
+        zeroed = true;
+    }
+
+    if (!cluster_sector || zeroed) {
+        if (!allocate) {
+            return zeroed ? VMDK_ZEROED : VMDK_UNALLOC;
+        }
+
+        cluster_sector = extent->next_cluster_sector;
+        extent->next_cluster_sector += extent->cluster_sectors
+                                                * nb_clusters;
+
+        ret = vmdk_perform_cow(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
+                               offset, skip_start_bytes,
+                               skip_end_bytes);
+        if (ret < 0) {
+            return ret;
+        }
+        if (m_data) {
+            m_data->valid = 1;
+            m_data->l1_index = l1_index;
+            m_data->l2_index = l2_index;
+            m_data->l2_offset = l2_offset;
+            m_data->l2_cache_entry = &l2_table[l2_index];
+            m_data->nb_clusters = nb_clusters;
         }
     }
-    if (m_data->l2_cache_entry) {
-        *m_data->l2_cache_entry = offset;
+    *cluster_offset = cluster_sector << BDRV_SECTOR_BITS;
+    return VMDK_OK;
+}
+
+/*
+ * vmdk_alloc_cluster_offset
+ *
+ * For a given offset on the virtual disk, find the cluster offset in vmdk
+ * file. If the offset is not found, allocate a new cluster.
+ *
+ * If the cluster is newly allocated, m_data->nb_clusters is set to the number
+ * of contiguous clusters that have been allocated. In this case, the other
+ * fields of m_data are valid and contain information about the first allocated
+ * cluster.
+ *
+ * Returns:
+ *
+ *   VMDK_OK:           on success and @cluster_offset was set
+ *
+ *   VMDK_UNALLOC:      if no clusters were allocated and @cluster_offset is
+ *                      set to zero
+ *
+ *   VMDK_ERROR:        in error cases
+ */
+static int vmdk_alloc_cluster_offset(BlockDriverState *bs,
+                                     VmdkExtent *extent,
+                                     VmdkMetaData *m_data, uint64_t offset,
+                                     bool allocate, uint64_t *cluster_offset,
+                                     int64_t bytes,
+                                     uint32_t *total_alloc_clusters)
+{
+    uint64_t start, remaining;
+    uint64_t new_cluster_offset;
+    int64_t n_bytes;
+    int ret;
+
+    if (extent->flat) {
+        *cluster_offset = extent->flat_start_offset;
+        return VMDK_OK;
+    }
+
+    start = offset;
+    remaining = bytes;
+    new_cluster_offset = 0;
+    *cluster_offset = 0;
+    n_bytes = 0;
+    if (m_data) {
+        m_data->valid = 0;
+    }
+
+    /* due to L2 table margins all bytes may not get allocated at once */
+    while (true) {
+
+        if (!*cluster_offset) {
+            *cluster_offset = new_cluster_offset;
+        }
+
+        start += n_bytes;
+        remaining -= n_bytes;
+        new_cluster_offset += n_bytes;
+
+        if (remaining == 0) {
+            break;
+        }
+
+        n_bytes = remaining;
+
+        ret = handle_alloc(bs, extent, start, &new_cluster_offset, &n_bytes,
+                           m_data, allocate, total_alloc_clusters);
+
+        if (ret < 0) {
+            return ret;
+
+        }
     }
 
     return VMDK_OK;
@@ -1567,6 +1842,7 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
     uint64_t cluster_offset;
     uint64_t bytes_done = 0;
     VmdkMetaData m_data;
+    uint32_t total_alloc_clusters = 0;
 
     if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
         error_report("Wrong offset: offset=0x%" PRIx64
@@ -1584,10 +1860,10 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
         n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
                              - offset_in_cluster);
 
-        ret = get_cluster_offset(bs, extent, &m_data, offset,
-                                 !(extent->compressed || zeroed),
-                                 &cluster_offset, offset_in_cluster,
-                                 offset_in_cluster + n_bytes);
+        ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
+                                        !(extent->compressed || zeroed),
+                                        &cluster_offset, n_bytes,
+                                        &total_alloc_clusters);
         if (extent->compressed) {
             if (ret == VMDK_OK) {
                 /* Refuse write to allocated cluster for streamOptimized */
@@ -1596,19 +1872,22 @@ static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
                 return -EIO;
             } else {
                 /* allocate */
-                ret = get_cluster_offset(bs, extent, &m_data, offset,
-                                         true, &cluster_offset, 0, 0);
+                ret = vmdk_alloc_cluster_offset(bs, extent, &m_data, offset,
+                                                true, &cluster_offset, n_bytes,
+                                                &total_alloc_clusters);
             }
         }
         if (ret == VMDK_ERROR) {
             return -EINVAL;
         }
+
         if (zeroed) {
             /* Do zeroed write, buf is ignored */
-            if (extent->has_zero_grain &&
-                    offset_in_cluster == 0 &&
-                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
-                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
+            if (extent->has_zero_grain && offset_in_cluster == 0 &&
+                    n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE *
+                        total_alloc_clusters) {
+                n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE *
+                            total_alloc_clusters;
                 if (!zero_dry_run) {
                     /* update L2 tables */
                     if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
-- 
2.6.2
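
A note on the arithmetic in size_to_clusters(): the rounding step followed by DIV_ROUND_UP(...) - 1 appears to reduce to an integer division by the cluster size whenever cluster_sectors is 128, because the divisor is hard-coded as BDRV_SECTOR_SIZE * 128. The standalone sketch below is only an illustration and is not part of the patch. SECTOR_SIZE stands in for BDRV_SECTOR_SIZE, and the names size_to_clusters_patch() and whole_clusters() are hypothetical. It reproduces the patch's expression next to a plain division so the two can be compared for different grain sizes.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u /* stand-in for BDRV_SECTOR_SIZE (assumption) */

/* Same expression as size_to_clusters() in the patch, with the
 * hard-coded 128 kept as-is. DIV_ROUND_UP(n, d) is (n + d - 1) / d. */
static uint64_t size_to_clusters_patch(uint32_t cluster_sectors, uint64_t size)
{
    uint64_t cluster_size, round_off_size, divisor;

    assert(cluster_sectors != 0);
    cluster_size = (uint64_t)cluster_sectors * SECTOR_SIZE;
    round_off_size = cluster_size - (size % cluster_size);
    divisor = (uint64_t)SECTOR_SIZE * 128u;
    return (size + round_off_size + divisor - 1) / divisor - 1;
}

/* Hypothetical alternative: number of whole clusters contained in size. */
static uint64_t whole_clusters(uint32_t cluster_sectors, uint64_t size)
{
    assert(cluster_sectors != 0);
    return size / ((uint64_t)cluster_sectors * SECTOR_SIZE);
}
```

With 128-sector clusters both functions return 3 for a 200000-byte request. With 16-sector clusters and an 8192-byte request, the patch's expression yields 0 while the plain division yields 1, so the hard-coded 128 may be worth replacing with extent->cluster_sectors if extents with other grain sizes are expected to reach this path.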