[Qemu-devel] [PATCH 1/4] block: add bdrv_get_format_alloc_stat format interface

Vladimir Sementsov-Ogievskiy posted 4 patches 8 years, 8 months ago
There is a newer version of this series
[Qemu-devel] [PATCH 1/4] block: add bdrv_get_format_alloc_stat format interface
Posted by Vladimir Sementsov-Ogievskiy 8 years, 8 months ago
The function should collect statistics, about allocted/unallocated by
top-level format driver space (in its .file) and allocation status
(allocated/hole/after eof) of corresponding areas in this .file.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block.c                   | 16 ++++++++++++++++
 include/block/block.h     |  3 +++
 include/block/block_int.h |  2 ++
 qapi/block-core.json      | 26 ++++++++++++++++++++++++++
 4 files changed, 47 insertions(+)

diff --git a/block.c b/block.c
index 50ba264143..7d720ae0c2 100644
--- a/block.c
+++ b/block.c
@@ -3407,6 +3407,22 @@ int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
 }
 
 /**
+ * Collect format allocation info. See BlockFormatAllocInfo definition in
+ * qapi/block-core.json.
+ */
+int bdrv_get_format_alloc_stat(BlockDriverState *bs, BlockFormatAllocInfo *bfai)
+{
+    BlockDriver *drv = bs->drv;
+    if (!drv) {
+        return -ENOMEDIUM;
+    }
+    if (drv->bdrv_get_format_alloc_stat) {
+        return drv->bdrv_get_format_alloc_stat(bs, bfai);
+    }
+    return -ENOTSUP;
+}
+
+/**
  * Return number of sectors on success, -errno on error.
  */
 int64_t bdrv_nb_sectors(BlockDriverState *bs)
diff --git a/include/block/block.h b/include/block/block.h
index 9b355e92d8..646376a772 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -335,6 +335,9 @@ typedef enum {
 
 int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res, BdrvCheckMode fix);
 
+int bdrv_get_format_alloc_stat(BlockDriverState *bs,
+                               BlockFormatAllocInfo *bfai);
+
 /* The units of offset and total_work_size may be chosen arbitrarily by the
  * block driver; total_work_size may change during the course of the amendment
  * operation */
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8d3724cce6..458c715e99 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -208,6 +208,8 @@ struct BlockDriver {
     int64_t (*bdrv_getlength)(BlockDriverState *bs);
     bool has_variable_length;
     int64_t (*bdrv_get_allocated_file_size)(BlockDriverState *bs);
+    int (*bdrv_get_format_alloc_stat)(BlockDriverState *bs,
+                                      BlockFormatAllocInfo *bfai);
 
     int coroutine_fn (*bdrv_co_pwritev_compressed)(BlockDriverState *bs,
         uint64_t offset, uint64_t bytes, QEMUIOVector *qiov);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ea0b3e8b13..365070b3eb 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -139,6 +139,32 @@
            '*format-specific': 'ImageInfoSpecific' } }
 
 ##
+# @BlockFormatAllocInfo:
+#
+# Information about allocations, including metadata. All fields are in bytes.
+#
+# @alloc_alloc: allocated by format driver and allocated in underlying file
+#
+# @alloc_hole: allocated by format driver but actually is a hole in
+#              underlying file
+#
+# @alloc_overhead: allocated by format driver after end of underlying file
+#
+# @hole_alloc: not allocated by format driver but allocated in underlying file
+#
+# @hole_hole: not allocated by format driver hole in underlying file
+#
+# Since: 2.10
+#
+##
+{ 'struct': 'BlockFormatAllocInfo',
+  'data': {'alloc_alloc':    'uint64',
+           'alloc_hole':     'uint64',
+           'alloc_overhead': 'uint64',
+           'hole_alloc':     'uint64',
+           'hole_hole':      'uint64' } }
+
+##
 # @ImageCheck:
 #
 # Information about a QEMU image file check
-- 
2.11.1


Re: [Qemu-devel] [PATCH 1/4] block: add bdrv_get_format_alloc_stat format interface
Posted by Eric Blake 8 years, 8 months ago
On 05/30/2017 05:48 AM, Vladimir Sementsov-Ogievskiy wrote:
> The function should collect statistics, about allocted/unallocated by
> top-level format driver space (in its .file) and allocation status
> (allocated/hole/after eof) of corresponding areas in this .file.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c                   | 16 ++++++++++++++++
>  include/block/block.h     |  3 +++
>  include/block/block_int.h |  2 ++
>  qapi/block-core.json      | 26 ++++++++++++++++++++++++++
>  4 files changed, 47 insertions(+)
> 
> diff --git a/block.c b/block.c
> index 50ba264143..7d720ae0c2 100644
> --- a/block.c

> +++ b/qapi/block-core.json
> @@ -139,6 +139,32 @@
>             '*format-specific': 'ImageInfoSpecific' } }
>  
>  ##
> +# @BlockFormatAllocInfo:
> +#
> +# Information about allocations, including metadata. All fields are in bytes.
> +#
> +# @alloc_alloc: allocated by format driver and allocated in underlying file
> +#

New structs should favor naming with '-' rather than '_'. So this should
be 'alloc-alloc', if we like the name.

So this statistic represents the portions of the format file that are in
use by the format, and which are backed by data in the underlying protocol.

> +# @alloc_hole: allocated by format driver but actually is a hole in
> +#              underlying file

The portions of the format file in use by the format, but where the
entire cluster is a hole in the underlying file (note that with cluster
size large enough, you can get a cluster that is part-data/part-hole in
the underlying file, it looks like you are counting those as data).

> +#
> +# @alloc_overhead: allocated by format driver after end of underlying file

The portions of the format file in use by the format, but where the
entire cluster is beyond the end of the underlying file (the effective
hole).  Do we really need to distinguish between hole within the
underlying file and hole beyond the end of the file? Or can this number
be combined with the previous?

> +#
> +# @hole_alloc: not allocated by format driver but allocated in underlying file

I'm not sure I like this name.  "Hole" in file-system terminology means
"reads-as-zero" - but here we are talking about portions of the format
layer that are unallocated (defer to backing file, and we can't
guarantee they read as zero unless there is no backing file).

This statistic represents the portion of the underlying file that has
been previously allocated to hold data, but which is now unused by the
format layer (in other words, leaked clusters that can be reclaimed, or
which can be reused instead of growing the underlying the file if new
clusters are allocated).

> +#
> +# @hole_hole: not allocated by format driver hole in underlying file

This statistic represents fragmentation - portions of the format layer
that are no longer in use, but which are also occupying no space in the
underlying file.  A refragmentation operation (if we ever implemented
one) could remove the address space used by these clusters, but would
not change the disk usage.

> +#
> +# Since: 2.10
> +#
> +##
> +{ 'struct': 'BlockFormatAllocInfo',
> +  'data': {'alloc_alloc':    'uint64',
> +           'alloc_hole':     'uint64',
> +           'alloc_overhead': 'uint64',
> +           'hole_alloc':     'uint64',
> +           'hole_hole':      'uint64' } }

The idea seems okay, but the naming needs to be fixed.  Also, I'm not
sure if we need all 5, or if 4 is enough; and I'm not sure if we have
the right names ("how does alloc-hole differ from hole-alloc?"), or if
we can come up with something more descriptive.  Particularly since
"hole-" is not a hole in the filesystem sense, so much as unused
clusters.  But I'm also not coming up with better names to suggest at
the moment.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH 1/4] block: add bdrv_get_format_alloc_stat format interface
Posted by Vladimir Sementsov-Ogievskiy 8 years, 8 months ago
30.05.2017 17:53, Eric Blake wrote:
> On 05/30/2017 05:48 AM, Vladimir Sementsov-Ogievskiy wrote:
>> The function should collect statistics, about allocted/unallocated by
>> top-level format driver space (in its .file) and allocation status
>> (allocated/hole/after eof) of corresponding areas in this .file.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block.c                   | 16 ++++++++++++++++
>>   include/block/block.h     |  3 +++
>>   include/block/block_int.h |  2 ++
>>   qapi/block-core.json      | 26 ++++++++++++++++++++++++++
>>   4 files changed, 47 insertions(+)
>>
>> diff --git a/block.c b/block.c
>> index 50ba264143..7d720ae0c2 100644
>> --- a/block.c
>> +++ b/qapi/block-core.json
>> @@ -139,6 +139,32 @@
>>              '*format-specific': 'ImageInfoSpecific' } }
>>   
>>   ##
>> +# @BlockFormatAllocInfo:
>> +#
>> +# Information about allocations, including metadata. All fields are in bytes.

s/All fields are in bytes./All fields are in bytes and aligned by sector 
(512 bytes)./

- ok? to emphasize that there is nothing about clusters... Or may be 
better to write that they are aligned by byte.

>> +#
>> +# @alloc_alloc: allocated by format driver and allocated in underlying file
>> +#
> New structs should favor naming with '-' rather than '_'. So this should
> be 'alloc-alloc', if we like the name.
>
> So this statistic represents the portions of the format file that are in
> use by the format, and which are backed by data in the underlying protocol.
>
>> +# @alloc_hole: allocated by format driver but actually is a hole in
>> +#              underlying file
> The portions of the format file in use by the format, but where the
> entire cluster is a hole in the underlying file (note that with cluster
> size large enough, you can get a cluster that is part-data/part-hole in
> the underlying file, it looks like you are counting those as data).
>
>> +#
>> +# @alloc_overhead: allocated by format driver after end of underlying file
> The portions of the format file in use by the format, but where the
> entire cluster is beyond the end of the underlying file (the effective
> hole).  Do we really need to distinguish between hole within the
> underlying file and hole beyond the end of the file? Or can this number
> be combined with the previous?
>
>> +#
>> +# @hole_alloc: not allocated by format driver but allocated in underlying file
> I'm not sure I like this name.  "Hole" in file-system terminology means
> "reads-as-zero" - but here we are talking about portions of the format
> layer that are unallocated (defer to backing file, and we can't
> guarantee they read as zero unless there is no backing file).
>
> This statistic represents the portion of the underlying file that has
> been previously allocated to hold data, but which is now unused by the
> format layer (in other words, leaked clusters that can be reclaimed, or
> which can be reused instead of growing the underlying the file if new
> clusters are allocated).
>
>> +#
>> +# @hole_hole: not allocated by format driver hole in underlying file
> This statistic represents fragmentation - portions of the format layer
> that are no longer in use, but which are also occupying no space in the
> underlying file.  A refragmentation operation (if we ever implemented
> one) could remove the address space used by these clusters, but would
> not change the disk usage.
>
>> +#
>> +# Since: 2.10
>> +#
>> +##
>> +{ 'struct': 'BlockFormatAllocInfo',
>> +  'data': {'alloc_alloc':    'uint64',
>> +           'alloc_hole':     'uint64',
>> +           'alloc_overhead': 'uint64',
>> +           'hole_alloc':     'uint64',
>> +           'hole_hole':      'uint64' } }
> The idea seems okay, but the naming needs to be fixed.  Also, I'm not
> sure if we need all 5, or if 4 is enough; and I'm not sure if we have
> the right names ("how does alloc-hole differ from hole-alloc?"), or if
> we can come up with something more descriptive.  Particularly since
> "hole-" is not a hole in the filesystem sense, so much as unused
> clusters.  But I'm also not coming up with better names to suggest at
> the moment.
>
how about:

used-allocated
used-discarded
used-overrun

unused-allocated
unused-discarded


also, do you mention that your detailed wordings should be included into 
block-core.json or you just clarify things?


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH 1/4] block: add bdrv_get_format_alloc_stat format interface
Posted by Eric Blake 8 years, 8 months ago
On 06/02/2017 10:26 AM, Vladimir Sementsov-Ogievskiy wrote:
> 30.05.2017 17:53, Eric Blake wrote:
>> On 05/30/2017 05:48 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> The function should collect statistics, about allocted/unallocated by
>>> top-level format driver space (in its .file) and allocation status
>>> (allocated/hole/after eof) of corresponding areas in this .file.
>>>

>>> +# @BlockFormatAllocInfo:
>>> +#
>>> +# Information about allocations, including metadata. All fields are
>>> in bytes.
> 
> s/All fields are in bytes./All fields are in bytes and aligned by sector
> (512 bytes)./

I wouldn't even promise sector alignment. We probably happen to have
sector alignment (especially for qcow2, since the smallest cluster size
permitted is sector aligned), but I see no inherent reason why we can't
support sub-sector values if we are reporting in bytes.

> 
> - ok? to emphasize that there is nothing about clusters... Or may be
> better to write that they are aligned by byte.

I think "All fields are in bytes" is sufficient.


>>> +{ 'struct': 'BlockFormatAllocInfo',
>>> +  'data': {'alloc_alloc':    'uint64',
>>> +           'alloc_hole':     'uint64',
>>> +           'alloc_overhead': 'uint64',
>>> +           'hole_alloc':     'uint64',
>>> +           'hole_hole':      'uint64' } }
>> The idea seems okay, but the naming needs to be fixed.  Also, I'm not
>> sure if we need all 5, or if 4 is enough; and I'm not sure if we have
>> the right names ("how does alloc-hole differ from hole-alloc?"), or if
>> we can come up with something more descriptive.  Particularly since
>> "hole-" is not a hole in the filesystem sense, so much as unused
>> clusters.  But I'm also not coming up with better names to suggest at
>> the moment.
>>
> how about:
> 
> used-allocated
> used-discarded
> used-overrun
> 
> unused-allocated
> unused-discarded

Those work for me.

> 
> 
> also, do you mention that your detailed wordings should be included into
> block-core.json or you just clarify things?

Good documentation is worth the effort. I don't know if you want all of
my details in block-core.json, but giving a better overview of how each
statistic is possible does make it easier to visualize what the
statistic is actually counting.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org