[Qemu-devel] [PATCH v4 0/3] qemu-img check: format allocation info

Vladimir Sementsov-Ogievskiy posted 3 patches 6 years, 9 months ago
Posted by Vladimir Sementsov-Ogievskiy 6 years, 9 months ago
Hi all.

See 01 patch for the doc.

Question to discuss.
If I understand correctly, the get_block_status flags allocated, zero, and data
actually provide 5 possible combinations, which I combine into three.

allocated data zero
1         1    1    \__ data
1         1    0    /
1         0    1    \__ zero
0         0    1    /
0         0    0    ___ discarded

This division doesn't look bad, but it is not the only one possible.
Separating data is really useful: it shows leaked clusters.
So the question is, don't we want to adjust the division?
I'm ok with the current one.
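
As a minimal sketch of the grouping in the table above (not code from the patches), the five flag triples can be looked up directly and anything else rejected. The names `Category` and `classify` are hypothetical and chosen only for illustration; the flag order follows the table's columns.

```python
from enum import Enum


class Category(Enum):
    DATA = "data"
    ZERO = "zero"
    DISCARDED = "discarded"


# Keys are (allocated, data, zero), in the same column order as the table.
_TABLE = {
    (1, 1, 1): Category.DATA,
    (1, 1, 0): Category.DATA,
    (1, 0, 1): Category.ZERO,
    (0, 0, 1): Category.ZERO,
    (0, 0, 0): Category.DISCARDED,
}


def classify(allocated, data, zero):
    """Return the category for one flag triple listed in the table.

    Triples that the table does not list raise ValueError, since the
    cover letter only considers these five combinations.
    """
    key = (int(bool(allocated)), int(bool(data)), int(bool(zero)))
    try:
        return _TABLE[key]
    except KeyError:
        raise ValueError(f"flag combination not in the table: {key}")


if __name__ == "__main__":
    for bits in _TABLE:
        print(bits, "->", classify(*bits).value)
```

Using an explicit lookup table rather than nested conditionals keeps the sketch in one-to-one correspondence with the rows above, so a different division only means changing the right-hand side of a row.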

v4: - reword docs in 01
    - s/2.10/2.11/ for qapi

v3: - improve docs
    - rename fields
    - add 'zero' type of underlying file portions status. This is because
      these areas cannot be presented as 'discarded', but they do not
      occupy real space, so we don't want to present them as
      'allocated data' either.
    - remove last patch. It is not needed after introducing new naming
      for the fields

v2: fix build error: gcc thinks that some variables may be used
    uninitialized (actually they are not).

v1: 

This series is a replacement for the "qemu-img check: unallocated size"
series.

There was a question: should clusters that are allocated in qcow2 but are
actually holes in the underlying file be accounted as allocated or not?
Instead of hiding this information in a one-number statistic, I've decided
to print the whole information, 5 numbers:

For clusters allocated by the top-level format driver (qcow2, for example),
3 numbers, the number of bytes which are:
 - allocated in underlying file
 - holes in underlying file
 - after end of underlying file

To account other areas of underlying file, 2 more numbers of bytes:
 - unallocated by top-level driver but allocated in underlying file
 - unallocated by top-level driver and holes in underlying file
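
The five counters described above could be accumulated roughly as follows. This is only an illustrative sketch: `Extent`, `FileState`, `summarize`, and the counter names are assumptions made for this example, not the field names used in the series' QAPI schema. It also assumes that a range which is unallocated by the format driver and lies past the end of the file is simply not counted, since the description lists no number for it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class FileState(Enum):
    ALLOCATED = "allocated"  # occupies real space in the underlying file
    HOLE = "hole"            # a hole in the underlying file
    PAST_EOF = "past_eof"    # lies after the end of the underlying file


@dataclass
class Extent:
    length: int              # size of the range in bytes
    format_allocated: bool   # allocated by the top-level format driver?
    file_state: FileState    # state of the same range in the underlying file


def summarize(extents: Iterable[Extent]) -> Dict[str, int]:
    """Sum extent lengths into the five byte counters (hypothetical names)."""
    stats = {
        "alloc_in_file": 0,     # format-allocated, allocated in file
        "alloc_hole": 0,        # format-allocated, hole in file
        "alloc_past_eof": 0,    # format-allocated, after end of file
        "unalloc_in_file": 0,   # format-unallocated, allocated in file
        "unalloc_hole": 0,      # format-unallocated, hole in file
    }
    for ext in extents:
        if ext.format_allocated:
            if ext.file_state is FileState.ALLOCATED:
                stats["alloc_in_file"] += ext.length
            elif ext.file_state is FileState.HOLE:
                stats["alloc_hole"] += ext.length
            else:
                stats["alloc_past_eof"] += ext.length
        elif ext.file_state is FileState.ALLOCATED:
            stats["unalloc_in_file"] += ext.length
        elif ext.file_state is FileState.HOLE:
            stats["unalloc_hole"] += ext.length
        # Assumption: format-unallocated ranges past EOF are not counted.
    return stats


if __name__ == "__main__":
    demo = [
        Extent(65536, True, FileState.ALLOCATED),
        Extent(65536, True, FileState.HOLE),
        Extent(131072, False, FileState.ALLOCATED),
    ]
    print(summarize(demo))
```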

Vladimir Sementsov-Ogievskiy (3):
  block: add bdrv_get_format_alloc_stat format interface
  qcow2: add .bdrv_get_format_alloc_stat
  qemu-img check: add format allocation info

 block.c                   |  16 +++++
 block/qcow2-refcount.c    | 152 ++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c             |   2 +
 block/qcow2.h             |   2 +
 include/block/block.h     |   3 +
 include/block/block_int.h |   2 +
 qapi/block-core.json      |  78 +++++++++++++++++++++++-
 qemu-img.c                |  42 +++++++++++++
 8 files changed, 296 insertions(+), 1 deletion(-)

-- 
2.11.1


Re: [Qemu-devel] [PATCH v4 0/3] qemu-img check: format allocation info
Posted by Eric Blake 6 years, 9 months ago
On 07/29/2017 11:41 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all.
> 
> See 01 patch for the doc.
> 
> Question to discuss.
> If I understand correctly, the get_block_status flags allocated, zero, and data
> actually provide 5 possible combinations, which I combine into three.

There are actually 8 possible bit combinations, but you are right that
some of them are in practice impossible (since the allocated bit can
only be set in cases where the underlying driver set the data or zero bit).

> 
> allocated data zero
> 1         1    1    \__ data

This one is interesting - it means we know the contents read as zero,
but that it occupies space on the disk instead of being a hole;
reporting it as zero may make it easier to punch a hole.

> 1         1    0    /

Yes, definitely data, and no clue if it can be turned into a hole.

> 1         0    1    \__ zero
> 0         0    1    /

Yes, definitely zero.  (The former happens when a format layer directly
reports that the current layer reads as zero; the latter is possible
when a format layer doesn't have an allocation, but where we know
unallocated clusters read as zero, perhaps because there is no backing
file to further fall back to).

> 0         0    0    ___ discarded

Could also mean hasn't been touched yet (discarded sort of implies that
it has been touched at some point in the past)

The other bit patterns:

  0         1    0    - not possible: if a driver sets data, then the
                        block layer sets allocated
  0         1    1    - ditto
  1         0    0    - not possible: nothing sets the allocated bit in
                        isolation
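
The two rules stated above are enough to reproduce the split into possible and impossible patterns. A small sketch (the helper name `is_possible` is hypothetical, and the triples use the same allocated/data/zero column order as the table) enumerates all 8 combinations and filters them:

```python
from itertools import product


def is_possible(allocated, data, zero):
    """Apply the two rules from the discussion to one flag triple."""
    # Rule 1: if a driver sets data, the block layer also sets allocated.
    if data and not allocated:
        return False
    # Rule 2: nothing sets allocated in isolation; it needs data or zero.
    if allocated and not (data or zero):
        return False
    return True


# All 8 (allocated, data, zero) triples, in binary counting order.
all_bits = list(product((0, 1), repeat=3))
possible = [bits for bits in all_bits if is_possible(*bits)]
impossible = [bits for bits in all_bits if not is_possible(*bits)]

if __name__ == "__main__":
    print("possible:  ", possible)
    # possible:   [(0, 0, 0), (0, 0, 1), (1, 0, 1), (1, 1, 0), (1, 1, 1)]
    print("impossible:", impossible)
    # impossible: [(0, 1, 0), (0, 1, 1), (1, 0, 0)]
```

The filter leaves exactly the five combinations from the quoted table and rejects the three patterns listed here.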

> 
> This division doesn't look bad, but it is not the only one possible.
> Separating data is really useful: it shows leaked clusters.
> So the question is, don't we want to adjust the division?
> I'm ok with the current one.
> 
-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v4 0/3] qemu-img check: format allocation info
Posted by Vladimir Sementsov-Ogievskiy 6 years, 9 months ago
31.07.2017 18:14, Eric Blake wrote:
> On 07/29/2017 11:41 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all.
>>
>> See 01 patch for the doc.
>>
>> Question to discuss.
>> If I understand correctly, the get_block_status flags allocated, zero, and data
>> actually provide 5 possible combinations, which I combine into three.
> There are actually 8 possible bit combinations, but you are right that
> some of them are in practice impossible (since the allocated bit can
> only be set in cases where the underlying driver set the data or zero bit).
>
>> allocated data zero
>> 1         1    1    \__ data
> This one is interesting - it means we know the contents read as zero,
> but that it occupies space on the disk instead of being a hole;
> reporting it as zero may make it easier to punch a hole.
>
>> 1         1    0    /
> Yes, definitely data, and no clue if it can be turned into a hole.
>
>> 1         0    1    \__ zero
>> 0         0    1    /
> Yes, definitely zero.  (The former happens when a format layer directly
> reports that the current layer reads as zero; the latter is possible
> when a format layer doesn't have an allocation, but where we know
> unallocated clusters read as zero, perhaps because there is no backing
> file to further fall back to).
>
>> 0         0    0    ___ discarded
> Could also mean hasn't been touched yet (discarded sort of implies that
> it has been touched at some point in the past)

Lately I don't like it either. What about renaming it to just
'unallocated'?

>
> The other bit patterns:
>
>    0         1    0    - not possible: if a driver sets data, then the
>                          block layer sets allocated
>    0         1    1    - ditto
>    1         0    0    - not possible: nothing sets the allocated bit in
>                          isolation
>
>> This division doesn't look bad, but it is not the only one possible.
>> Separating data is really useful: it shows leaked clusters.
>> So the question is, don't we want to adjust the division?
>> I'm ok with the current one.
>>

-- 
Best regards,
Vladimir