[v1] qcow2 check improvements

[Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 5 months ago

Rewrite corrupted L2 table entries that reference space outside the
underlying file.

Make each such L2 table entry read as all zeros without any allocation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 3c004e5bfe..3de3768a3c 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
             /* Mark cluster as used */
             csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
                     BDRV_SECTOR_SIZE;
+            if (csize > s->cluster_size) {
+                ret = fix_l2_entry_to_zero(
+                        bs, res, fix, l2_offset, i, active,
+                        "compressed cluster larger than cluster: size 0x%"
+                        PRIx64, csize);
+                if (ret < 0) {
+                    goto fail;
+                }
+                continue;
+            }
+
             coffset = l2_entry & s->cluster_offset_mask &
                       ~(BDRV_SECTOR_SIZE - 1);
+            if (coffset >= bdrv_getlength(bs->file->bs)) {
+                ret = fix_l2_entry_to_zero(
+                        bs, res, fix, l2_offset, i, active,
+                        "compressed cluster out of file: offset 0x%" PRIx64,
+                        coffset);
+                if (ret < 0) {
+                    goto fail;
+                }
+                continue;
+            }
+
             ret = qcow2_inc_refcounts_imrt(bs, res,
                                            refcount_table, refcount_table_size,
                                            coffset, csize);
@@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
         {
             uint64_t offset = l2_entry & L2E_OFFSET_MASK;
 
+            if (offset >= bdrv_getlength(bs->file->bs)) {
+                ret = fix_l2_entry_to_zero(
+                        bs, res, fix, l2_offset, i, active,
+                        "cluster out of file: offset 0x%" PRIx64, offset);
+                if (ret < 0) {
+                    goto fail;
+                }
+                continue;
+            }
+
             if (flags & CHECK_FRAG_INFO) {
                 res->bfi.allocated_clusters++;
                 if (next_contiguous_offset &&
-- 
2.11.1

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Max Reitz 7 years, 4 months ago

On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
> Rewrite corrupted L2 table entry, which reference space out of
> underlying file.
> 
> Make this L2 table entry read-as-all-zeros without any allocation.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
> 
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 3c004e5bfe..3de3768a3c 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>              /* Mark cluster as used */
>              csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>                      BDRV_SECTOR_SIZE;
> +            if (csize > s->cluster_size) {
> +                ret = fix_l2_entry_to_zero(
> +                        bs, res, fix, l2_offset, i, active,
> +                        "compressed cluster larger than cluster: size 0x%"
> +                        PRIx64, csize);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +                continue;
> +            }
> +

This seems recoverable, doesn't it?  Can we not try to just limit the
csize, or decompress the cluster with the given csize from the given
offset, disregarding the cluster limit?

>              coffset = l2_entry & s->cluster_offset_mask &
>                        ~(BDRV_SECTOR_SIZE - 1);
> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
> +                ret = fix_l2_entry_to_zero(
> +                        bs, res, fix, l2_offset, i, active,
> +                        "compressed cluster out of file: offset 0x%" PRIx64,
> +                        coffset);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +                continue;
> +            }
> +
>              ret = qcow2_inc_refcounts_imrt(bs, res,
>                                             refcount_table, refcount_table_size,
>                                             coffset, csize);
> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>          {
>              uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>  
> +            if (offset >= bdrv_getlength(bs->file->bs)) {
> +                ret = fix_l2_entry_to_zero(
> +                        bs, res, fix, l2_offset, i, active,
> +                        "cluster out of file: offset 0x%" PRIx64, offset);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +                continue;
> +            }
> +

These other two look OK, but they have another issue:  If this is a v2
image, you cannot create zero clusters; so you'll have to unallocate the
cluster in that case.

Max

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 4 months ago


On 10/08/2018 11:51 PM, Max Reitz wrote:
> On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
>> Rewrite corrupted L2 table entry, which reference space out of
>> underlying file.
>>
>> Make this L2 table entry read-as-all-zeros without any allocation.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>>   1 file changed, 32 insertions(+)
>>
>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>> index 3c004e5bfe..3de3768a3c 100644
>> --- a/block/qcow2-refcount.c
>> +++ b/block/qcow2-refcount.c
>> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>               /* Mark cluster as used */
>>               csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>>                       BDRV_SECTOR_SIZE;
>> +            if (csize > s->cluster_size) {
>> +                ret = fix_l2_entry_to_zero(
>> +                        bs, res, fix, l2_offset, i, active,
>> +                        "compressed cluster larger than cluster: size 0x%"
>> +                        PRIx64, csize);
>> +                if (ret < 0) {
>> +                    goto fail;
>> +                }
>> +                continue;
>> +            }
>> +
> 
> This seems recoverable, isn't it?  Can we not try to just limit the
> csize, or decompress the cluster with the given csize from the given
> offset, disregarding the cluster limit?

Hm, so you want to assume that csize is corrupted but coffset may still
be correct? That seems unlikely to me.

So, to repair csize carefully, we would read up to one cluster (or one
cluster - 1 byte) of compressed data and try to decompress it into one
full cluster. If that succeeds, we either know the real csize or can
safely set it to one cluster.

Or we can simply set csize to one cluster if it is larger, and leave any
remaining problems to actual I/O, which will fail with EIO in the worst
case.
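
A minimal sketch of the second option (clamping an oversized csize to one
cluster instead of zeroing the entry). It decodes csize the same way the
patch context does; decode_csize(), clamp_csize() and SECTOR_SIZE are
hypothetical stand-ins for illustration, not existing QEMU helpers, and
the shift/mask values are passed in rather than taken from BDRVQcow2State:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u /* stands in for BDRV_SECTOR_SIZE */

/* Decode the compressed size in bytes from a compressed L2 entry. */
static uint64_t decode_csize(uint64_t l2_entry, unsigned csize_shift,
                             uint64_t csize_mask)
{
    return (((l2_entry >> csize_shift) & csize_mask) + 1) * SECTOR_SIZE;
}

/* Limit csize to one cluster; a wrong size then surfaces as a read
 * error (EIO in the worst case) instead of silently losing the data. */
static uint64_t clamp_csize(uint64_t csize, uint64_t cluster_size)
{
    assert(cluster_size >= SECTOR_SIZE);
    return csize > cluster_size ? cluster_size : csize;
}
```

Whether the clamped entry should also be written back to the L2 table
during repair is left open here, as it is in the discussion.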

> 
>>               coffset = l2_entry & s->cluster_offset_mask &
>>                         ~(BDRV_SECTOR_SIZE - 1);
>> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
>> +                ret = fix_l2_entry_to_zero(
>> +                        bs, res, fix, l2_offset, i, active,
>> +                        "compressed cluster out of file: offset 0x%" PRIx64,
>> +                        coffset);
>> +                if (ret < 0) {
>> +                    goto fail;
>> +                }
>> +                continue;
>> +            }
>> +
>>               ret = qcow2_inc_refcounts_imrt(bs, res,
>>                                              refcount_table, refcount_table_size,
>>                                              coffset, csize);
>> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>           {
>>               uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>>   
>> +            if (offset >= bdrv_getlength(bs->file->bs)) {
>> +                ret = fix_l2_entry_to_zero(
>> +                        bs, res, fix, l2_offset, i, active,
>> +                        "cluster out of file: offset 0x%" PRIx64, offset);
>> +                if (ret < 0) {
>> +                    goto fail;
>> +                }
>> +                continue;
>> +            }
>> +
> 
> These other two look OK, but they have another issue:  If this is a v2
> image, you cannot create zero clusters; so you'll have to unallocate the
> cluster in that case.


Oh, that's a problem. Discarding clusters may be unsafe, because it
exposes the backing image through the holes. What does discard do on v2:
zeroing or holes?


> 
> Max
>

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Max Reitz 7 years, 4 months ago

On 09.10.18 00:02, Vladimir Sementsov-Ogievskiy wrote:
> 
> 
> On 10/08/2018 11:51 PM, Max Reitz wrote:
>> On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
>>> Rewrite corrupted L2 table entry, which reference space out of
>>> underlying file.
>>>
>>> Make this L2 table entry read-as-all-zeros without any allocation.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>>>   1 file changed, 32 insertions(+)
>>>
>>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>>> index 3c004e5bfe..3de3768a3c 100644
>>> --- a/block/qcow2-refcount.c
>>> +++ b/block/qcow2-refcount.c
>>> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>               /* Mark cluster as used */
>>>               csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>>>                       BDRV_SECTOR_SIZE;
>>> +            if (csize > s->cluster_size) {
>>> +                ret = fix_l2_entry_to_zero(
>>> +                        bs, res, fix, l2_offset, i, active,
>>> +                        "compressed cluster larger than cluster: size 0x%"
>>> +                        PRIx64, csize);
>>> +                if (ret < 0) {
>>> +                    goto fail;
>>> +                }
>>> +                continue;
>>> +            }
>>> +
>>
>> This seems recoverable, isn't it?  Can we not try to just limit the
>> csize, or decompress the cluster with the given csize from the given
>> offset, disregarding the cluster limit?
> 
> Hm, you want to assume that csize is corrupted but coffset may be 
> correct? Unlikely, I think.

I think it is better to reconstruct data that is probably garbage than
to produce data that is definitely garbage (all zeroes).

> So, to carefully repair csize, we should decompress one cluster (or one 
> cluster - 1 byte) of data, trying to get one cluster of decompressed 
> data. If we succeed, we know csize, or we can safely set it to one cluster.

Yes.

> Or we can just set csize = 1 cluster, if it is larger. And leave 
> problems to real execution which will lead to EIO in worst case.

Or this, yes.

>>>               coffset = l2_entry & s->cluster_offset_mask &
>>>                         ~(BDRV_SECTOR_SIZE - 1);
>>> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
>>> +                ret = fix_l2_entry_to_zero(
>>> +                        bs, res, fix, l2_offset, i, active,
>>> +                        "compressed cluster out of file: offset 0x%" PRIx64,
>>> +                        coffset);
>>> +                if (ret < 0) {
>>> +                    goto fail;
>>> +                }
>>> +                continue;
>>> +            }
>>> +
>>>               ret = qcow2_inc_refcounts_imrt(bs, res,
>>>                                              refcount_table, refcount_table_size,
>>>                                              coffset, csize);
>>> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>           {
>>>               uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>>>   
>>> +            if (offset >= bdrv_getlength(bs->file->bs)) {
>>> +                ret = fix_l2_entry_to_zero(
>>> +                        bs, res, fix, l2_offset, i, active,
>>> +                        "cluster out of file: offset 0x%" PRIx64, offset);
>>> +                if (ret < 0) {
>>> +                    goto fail;
>>> +                }
>>> +                continue;
>>> +            }
>>> +
>>
>> These other two look OK, but they have another issue:  If this is a v2
>> image, you cannot create zero clusters; so you'll have to unallocate the
>> cluster in that case.
> 
> 
> Oho, it's a problem. It may be unsafe to discard clusters, making 
> backing image available through the holes. What discard do on v2? 
> Zeroing or holes?

Oh, right!  discard on v2 punches a hole.  So I see three ways:
(1) You can do the same and point to that bit of code, or
(2) You allocate a data cluster full of zeroes in case of v2, or
(3) You just error out.

(3) doesn't seem like the worst option.  Amending the image to be v3 is
always possible and trivial.  Maybe point the user to that option.
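
A minimal sketch of option (3), assuming the repair helper knows the
image version: the zero flag (bit 0 of a standard L2 entry) is only
valid in v3 images, so for v2 the helper refuses instead of punching a
hole. make_read_as_zero() and OFLAG_ZERO are hypothetical names used
only for this illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define OFLAG_ZERO (1ULL << 0) /* "reads as zeros" bit, v3 images only */

/*
 * Turn an L2 entry into an unallocated read-as-zero entry.
 * Returns 0 on success, -ENOTSUP for v2 images, where a zero cluster
 * cannot be expressed and unallocating would expose the backing file.
 */
static int make_read_as_zero(int qcow_version, uint64_t *l2_entry)
{
    assert(l2_entry != NULL);
    if (qcow_version < 3) {
        return -ENOTSUP; /* caller can suggest amending the image to v3 */
    }
    *l2_entry = OFLAG_ZERO; /* no host offset, zero flag set */
    return 0;
}
```

On error the entry is left untouched, so the caller can still report the
original corrupted value.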

Max

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 4 months ago


On 10/09/2018 01:08 AM, Max Reitz wrote:
> On 09.10.18 00:02, Vladimir Sementsov-Ogievskiy wrote:
>>
>>
>> On 10/08/2018 11:51 PM, Max Reitz wrote:
>>> On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
>>>> Rewrite corrupted L2 table entry, which reference space out of
>>>> underlying file.
>>>>
>>>> Make this L2 table entry read-as-all-zeros without any allocation.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>    block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>>>>    1 file changed, 32 insertions(+)
>>>>
>>>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>>>> index 3c004e5bfe..3de3768a3c 100644
>>>> --- a/block/qcow2-refcount.c
>>>> +++ b/block/qcow2-refcount.c
>>>> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>                /* Mark cluster as used */
>>>>                csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>>>>                        BDRV_SECTOR_SIZE;
>>>> +            if (csize > s->cluster_size) {
>>>> +                ret = fix_l2_entry_to_zero(
>>>> +                        bs, res, fix, l2_offset, i, active,
>>>> +                        "compressed cluster larger than cluster: size 0x%"
>>>> +                        PRIx64, csize);
>>>> +                if (ret < 0) {
>>>> +                    goto fail;
>>>> +                }
>>>> +                continue;
>>>> +            }
>>>> +
>>>
>>> This seems recoverable, isn't it?  Can we not try to just limit the
>>> csize, or decompress the cluster with the given csize from the given
>>> offset, disregarding the cluster limit?
>>
>> Hm, you want to assume that csize is corrupted but coffset may be
>> correct? Unlikely, I think.
> 
> Better to reconstruct probably garbage data than to definitely garbage
> data (all zeroes) is what I think.
> 
>> So, to carefully repair csize, we should decompress one cluster (or one
>> cluster - 1 byte) of data, trying to get one cluster of decompressed
>> data. If we succeed, we know csize, or we can safely set it to one cluster.
> 
> Yes.
> 
>> Or we can just set csize = 1 cluster, if it is larger. And leave
>> problems to real execution which will lead to EIO in worst case.
> 
> Or this, yes.
> 
>>>>                coffset = l2_entry & s->cluster_offset_mask &
>>>>                          ~(BDRV_SECTOR_SIZE - 1);
>>>> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
>>>> +                ret = fix_l2_entry_to_zero(
>>>> +                        bs, res, fix, l2_offset, i, active,
>>>> +                        "compressed cluster out of file: offset 0x%" PRIx64,
>>>> +                        coffset);
>>>> +                if (ret < 0) {
>>>> +                    goto fail;
>>>> +                }
>>>> +                continue;
>>>> +            }
>>>> +
>>>>                ret = qcow2_inc_refcounts_imrt(bs, res,
>>>>                                               refcount_table, refcount_table_size,
>>>>                                               coffset, csize);
>>>> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>            {
>>>>                uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>>>>    
>>>> +            if (offset >= bdrv_getlength(bs->file->bs)) {
>>>> +                ret = fix_l2_entry_to_zero(
>>>> +                        bs, res, fix, l2_offset, i, active,
>>>> +                        "cluster out of file: offset 0x%" PRIx64, offset);
>>>> +                if (ret < 0) {
>>>> +                    goto fail;
>>>> +                }
>>>> +                continue;
>>>> +            }
>>>> +
>>>
>>> These other two look OK, but they have another issue:  If this is a v2
>>> image, you cannot create zero clusters; so you'll have to unallocate the
>>> cluster in that case.
>>
>>
>> Oho, it's a problem. It may be unsafe to discard clusters, making
>> backing image available through the holes. What discard do on v2?
>> Zeroing or holes?
> 
> Oh, right!  discard on v2 punches a hole.  So I see three ways:
> (1) You can do the same and point to that bit of code, or
> (2) You allocate a data cluster full of zeroes in case of v2, or
> (3) You just error out.
> 
> (3) doesn't seem like the worst option.  

> Amending the image to be v3 is
> always possible and trivial. 

How can that be done for a corrupted image?

> Maybe point the user to that option.
> 
> Max
>

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Max Reitz 7 years, 4 months ago

On 09.10.18 00:14, Vladimir Sementsov-Ogievskiy wrote:
> 
> 
> On 10/09/2018 01:08 AM, Max Reitz wrote:
>> On 09.10.18 00:02, Vladimir Sementsov-Ogievskiy wrote:
>>>
>>>
>>> On 10/08/2018 11:51 PM, Max Reitz wrote:
>>>> On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Rewrite corrupted L2 table entry, which reference space out of
>>>>> underlying file.
>>>>>
>>>>> Make this L2 table entry read-as-all-zeros without any allocation.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>> ---
>>>>>    block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>>>>>    1 file changed, 32 insertions(+)
>>>>>
>>>>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>>>>> index 3c004e5bfe..3de3768a3c 100644
>>>>> --- a/block/qcow2-refcount.c
>>>>> +++ b/block/qcow2-refcount.c
>>>>> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>                /* Mark cluster as used */
>>>>>                csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>>>>>                        BDRV_SECTOR_SIZE;
>>>>> +            if (csize > s->cluster_size) {
>>>>> +                ret = fix_l2_entry_to_zero(
>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>> +                        "compressed cluster larger than cluster: size 0x%"
>>>>> +                        PRIx64, csize);
>>>>> +                if (ret < 0) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                continue;
>>>>> +            }
>>>>> +
>>>>
>>>> This seems recoverable, isn't it?  Can we not try to just limit the
>>>> csize, or decompress the cluster with the given csize from the given
>>>> offset, disregarding the cluster limit?
>>>
>>> Hm, you want to assume that csize is corrupted but coffset may be
>>> correct? Unlikely, I think.
>>
>> Better to reconstruct probably garbage data than to definitely garbage
>> data (all zeroes) is what I think.
>>
>>> So, to carefully repair csize, we should decompress one cluster (or one
>>> cluster - 1 byte) of data, trying to get one cluster of decompressed
>>> data. If we succeed, we know csize, or we can safely set it to one cluster.
>>
>> Yes.
>>
>>> Or we can just set csize = 1 cluster, if it is larger. And leave
>>> problems to real execution which will lead to EIO in worst case.
>>
>> Or this, yes.
>>
>>>>>                coffset = l2_entry & s->cluster_offset_mask &
>>>>>                          ~(BDRV_SECTOR_SIZE - 1);
>>>>> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
>>>>> +                ret = fix_l2_entry_to_zero(
>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>> +                        "compressed cluster out of file: offset 0x%" PRIx64,
>>>>> +                        coffset);
>>>>> +                if (ret < 0) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                continue;
>>>>> +            }
>>>>> +
>>>>>                ret = qcow2_inc_refcounts_imrt(bs, res,
>>>>>                                               refcount_table, refcount_table_size,
>>>>>                                               coffset, csize);
>>>>> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>            {
>>>>>                uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>>>>>    
>>>>> +            if (offset >= bdrv_getlength(bs->file->bs)) {
>>>>> +                ret = fix_l2_entry_to_zero(
>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>> +                        "cluster out of file: offset 0x%" PRIx64, offset);
>>>>> +                if (ret < 0) {
>>>>> +                    goto fail;
>>>>> +                }
>>>>> +                continue;
>>>>> +            }
>>>>> +
>>>>
>>>> These other two look OK, but they have another issue:  If this is a v2
>>>> image, you cannot create zero clusters; so you'll have to unallocate the
>>>> cluster in that case.
>>>
>>>
>>> Oho, it's a problem. It may be unsafe to discard clusters, making
>>> backing image available through the holes. What discard do on v2?
>>> Zeroing or holes?
>>
>> Oh, right!  discard on v2 punches a hole.  So I see three ways:
>> (1) You can do the same and point to that bit of code, or
>> (2) You allocate a data cluster full of zeroes in case of v2, or
>> (3) You just error out.
>>
>> (3) doesn't seem like the worst option.  
> 
>> Amending the image to be v3 is
>> always possible and trivial. 
> 
> how to do it for corrupted image?

Oh, yeah, you can't open a corrupted image, can you...  I suppose we
want a way to force-clear the flag anyway. :-)

Max

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 4 months ago


On 10/09/2018 01:21 AM, Max Reitz wrote:
> On 09.10.18 00:14, Vladimir Sementsov-Ogievskiy wrote:
>>
>>
>> On 10/09/2018 01:08 AM, Max Reitz wrote:
>>> On 09.10.18 00:02, Vladimir Sementsov-Ogievskiy wrote:
>>>>
>>>>
>>>> On 10/08/2018 11:51 PM, Max Reitz wrote:
>>>>> On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
>>>>>> Rewrite corrupted L2 table entry, which reference space out of
>>>>>> underlying file.
>>>>>>
>>>>>> Make this L2 table entry read-as-all-zeros without any allocation.
>>>>>>
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>>> ---
>>>>>>     block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>>>>>>     1 file changed, 32 insertions(+)
>>>>>>
>>>>>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>>>>>> index 3c004e5bfe..3de3768a3c 100644
>>>>>> --- a/block/qcow2-refcount.c
>>>>>> +++ b/block/qcow2-refcount.c
>>>>>> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>>                 /* Mark cluster as used */
>>>>>>                 csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>>>>>>                         BDRV_SECTOR_SIZE;
>>>>>> +            if (csize > s->cluster_size) {
>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>>> +                        "compressed cluster larger than cluster: size 0x%"
>>>>>> +                        PRIx64, csize);
>>>>>> +                if (ret < 0) {
>>>>>> +                    goto fail;
>>>>>> +                }
>>>>>> +                continue;
>>>>>> +            }
>>>>>> +
>>>>>
>>>>> This seems recoverable, isn't it?  Can we not try to just limit the
>>>>> csize, or decompress the cluster with the given csize from the given
>>>>> offset, disregarding the cluster limit?
>>>>
>>>> Hm, you want to assume that csize is corrupted but coffset may be
>>>> correct? Unlikely, I think.
>>>
>>> Better to reconstruct probably garbage data than to definitely garbage
>>> data (all zeroes) is what I think.
>>>
>>>> So, to carefully repair csize, we should decompress one cluster (or one
>>>> cluster - 1 byte) of data, trying to get one cluster of decompressed
>>>> data. If we succeed, we know csize, or we can safely set it to one cluster.
>>>
>>> Yes.
>>>
>>>> Or we can just set csize = 1 cluster, if it is larger. And leave
>>>> problems to real execution which will lead to EIO in worst case.
>>>
>>> Or this, yes.
>>>
>>>>>>                 coffset = l2_entry & s->cluster_offset_mask &
>>>>>>                           ~(BDRV_SECTOR_SIZE - 1);
>>>>>> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>>> +                        "compressed cluster out of file: offset 0x%" PRIx64,
>>>>>> +                        coffset);
>>>>>> +                if (ret < 0) {
>>>>>> +                    goto fail;
>>>>>> +                }
>>>>>> +                continue;
>>>>>> +            }
>>>>>> +
>>>>>>                 ret = qcow2_inc_refcounts_imrt(bs, res,
>>>>>>                                                refcount_table, refcount_table_size,
>>>>>>                                                coffset, csize);
>>>>>> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>>             {
>>>>>>                 uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>>>>>>     
>>>>>> +            if (offset >= bdrv_getlength(bs->file->bs)) {
>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>>> +                        "cluster out of file: offset 0x%" PRIx64, offset);
>>>>>> +                if (ret < 0) {
>>>>>> +                    goto fail;
>>>>>> +                }
>>>>>> +                continue;
>>>>>> +            }
>>>>>> +
>>>>>
>>>>> These other two look OK, but they have another issue:  If this is a v2
>>>>> image, you cannot create zero clusters; so you'll have to unallocate the
>>>>> cluster in that case.
>>>>
>>>>
>>>> Oho, it's a problem. It may be unsafe to discard clusters, making
>>>> backing image available through the holes. What discard do on v2?
>>>> Zeroing or holes?
>>>
>>> Oh, right!  discard on v2 punches a hole.  So I see three ways:
>>> (1) You can do the same and point to that bit of code, or
>>> (2) You allocate a data cluster full of zeroes in case of v2, or
>>> (3) You just error out.
>>>
>>> (3) doesn't seem like the worst option.
>>
>>> Amending the image to be v3 is
>>> always possible and trivial.
>>
>> how to do it for corrupted image?
> 
> Oh, yeah, you can't open a corrupted image, can you...  I suppose we
> want a way to force-clear the flag anyway. :-)

Um, which flag?

> 
> Max
>

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Max Reitz 7 years, 3 months ago

On 09.10.18 01:14, Vladimir Sementsov-Ogievskiy wrote:
> 
> 
> On 10/09/2018 01:21 AM, Max Reitz wrote:
>> On 09.10.18 00:14, Vladimir Sementsov-Ogievskiy wrote:
>>>
>>>
>>> On 10/09/2018 01:08 AM, Max Reitz wrote:
>>>> On 09.10.18 00:02, Vladimir Sementsov-Ogievskiy wrote:
>>>>>
>>>>>
>>>>> On 10/08/2018 11:51 PM, Max Reitz wrote:
>>>>>> On 17.08.18 14:22, Vladimir Sementsov-Ogievskiy wrote:
>>>>>>> Rewrite corrupted L2 table entry, which reference space out of
>>>>>>> underlying file.
>>>>>>>
>>>>>>> Make this L2 table entry read-as-all-zeros without any allocation.
>>>>>>>
>>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>>>> ---
>>>>>>>     block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>>>>>>>     1 file changed, 32 insertions(+)
>>>>>>>
>>>>>>> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
>>>>>>> index 3c004e5bfe..3de3768a3c 100644
>>>>>>> --- a/block/qcow2-refcount.c
>>>>>>> +++ b/block/qcow2-refcount.c
>>>>>>> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>>>                 /* Mark cluster as used */
>>>>>>>                 csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>>>>>>>                         BDRV_SECTOR_SIZE;
>>>>>>> +            if (csize > s->cluster_size) {
>>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>>>> +                        "compressed cluster larger than cluster: size 0x%"
>>>>>>> +                        PRIx64, csize);
>>>>>>> +                if (ret < 0) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                continue;
>>>>>>> +            }
>>>>>>> +
>>>>>>
>>>>>> This seems recoverable, isn't it?  Can we not try to just limit the
>>>>>> csize, or decompress the cluster with the given csize from the given
>>>>>> offset, disregarding the cluster limit?
>>>>>
>>>>> Hm, you want to assume that csize is corrupted but coffset may be
>>>>> correct? Unlikely, I think.
>>>>
>>>> Better to reconstruct probably garbage data than to definitely garbage
>>>> data (all zeroes) is what I think.
>>>>
>>>>> So, to carefully repair csize, we should decompress one cluster (or one
>>>>> cluster - 1 byte) of data, trying to get one cluster of decompressed
>>>>> data. If we succeed, we know csize, or we can safely set it to one cluster.
>>>>
>>>> Yes.
>>>>
>>>>> Or we can just set csize = 1 cluster, if it is larger. And leave
>>>>> problems to real execution which will lead to EIO in worst case.
>>>>
>>>> Or this, yes.
>>>>
>>>>>>>                 coffset = l2_entry & s->cluster_offset_mask &
>>>>>>>                           ~(BDRV_SECTOR_SIZE - 1);
>>>>>>> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
>>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>>>> +                        "compressed cluster out of file: offset 0x%" PRIx64,
>>>>>>> +                        coffset);
>>>>>>> +                if (ret < 0) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                continue;
>>>>>>> +            }
>>>>>>> +
>>>>>>>                 ret = qcow2_inc_refcounts_imrt(bs, res,
>>>>>>>                                                refcount_table, refcount_table_size,
>>>>>>>                                                coffset, csize);
>>>>>>> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>>>>>>>             {
>>>>>>>                 uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>>>>>>>     
>>>>>>> +            if (offset >= bdrv_getlength(bs->file->bs)) {
>>>>>>> +                ret = fix_l2_entry_to_zero(
>>>>>>> +                        bs, res, fix, l2_offset, i, active,
>>>>>>> +                        "cluster out of file: offset 0x%" PRIx64, offset);
>>>>>>> +                if (ret < 0) {
>>>>>>> +                    goto fail;
>>>>>>> +                }
>>>>>>> +                continue;
>>>>>>> +            }
>>>>>>> +
>>>>>>
>>>>>> These other two look OK, but they have another issue:  If this is a v2
>>>>>> image, you cannot create zero clusters; so you'll have to unallocate the
>>>>>> cluster in that case.
>>>>>
>>>>>
>>>>> Oh, that's a problem. It may be unsafe to discard clusters, since that
>>>>> makes the backing image visible through the holes. What does discard
>>>>> do on v2: zeroing or holes?
>>>>
>>>> Oh, right!  discard on v2 punches a hole.  So I see three ways:
>>>> (1) You can do the same and point to that bit of code, or
>>>> (2) You allocate a data cluster full of zeroes in case of v2, or
>>>> (3) You just error out.
>>>>
>>>> (3) doesn't seem like the worst option.
>>>
>>>> Amending the image to be v3 is
>>>> always possible and trivial.
>>>
>>> How would you do that for a corrupted image?
>>
>> Oh, yeah, you can't open a corrupted image, can you...  I suppose we
>> want a way to force-clear the flag anyway. :-)
> 
> Hm, which flag?

The corrupt flag in the image header.

Max
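
The repair options discussed in this message (limit csize to one cluster, treat an out-of-file offset as read-as-zero, and error out on v2 images where no zero flag exists) can be put together in a minimal, self-contained sketch. This is not QEMU code: `Image`, `Repair` and `repair_compressed_entry` are hypothetical names invented for illustration, the field decoding follows the quoted hunk, and the way the shift and masks are derived from `cluster_bits` is an assumption based on how the qcow2 driver sets up `csize_shift`, `csize_mask` and `cluster_offset_mask`. For v2 it arbitrarily picks option (3).

```c
#include <stdint.h>

#define SECTOR_SIZE 512ULL          /* same value as BDRV_SECTOR_SIZE */
#define OFLAG_ZERO  (1ULL << 0)     /* "reads as zeroes" bit, qcow2 v3 only */

/* Hypothetical result codes for this sketch (not QEMU names). */
typedef enum {
    ENTRY_OK,         /* nothing to repair */
    ENTRY_CLAMPED,    /* csize exceeded one cluster and was limited */
    ENTRY_ZEROED,     /* entry rewritten to read as zeroes (v3) */
    ENTRY_UNFIXABLE   /* v2 image: no zero flag, so only report the error */
} Repair;

/* Hypothetical, minimal stand-in for the image state the check needs. */
typedef struct {
    int cluster_bits;     /* log2(cluster size), e.g. 16 for 64 KiB */
    int version;          /* qcow2 version: 2 or 3 */
    uint64_t file_len;    /* length of the underlying file in bytes */
} Image;

static Repair repair_compressed_entry(const Image *img, uint64_t *l2_entry,
                                      uint64_t *csize_out)
{
    /* Assumed layout: sector count in the bits just below bit 62. */
    int csize_shift = 62 - (img->cluster_bits - 8);
    uint64_t csize_mask = (1ULL << (img->cluster_bits - 8)) - 1;
    uint64_t offset_mask = (1ULL << csize_shift) - 1;
    uint64_t cluster_size = 1ULL << img->cluster_bits;

    /* Same decoding as in the quoted hunk. */
    uint64_t csize = (((*l2_entry >> csize_shift) & csize_mask) + 1) *
                     SECTOR_SIZE;
    uint64_t coffset = *l2_entry & offset_mask & ~(SECTOR_SIZE - 1);

    if (coffset >= img->file_len) {
        if (img->version < 3) {
            /* Option (3): leave the entry alone; csize_out is not set. */
            return ENTRY_UNFIXABLE;
        }
        *l2_entry = OFLAG_ZERO;     /* read-as-zero without allocation */
        *csize_out = 0;
        return ENTRY_ZEROED;
    }
    if (csize > cluster_size) {
        /* "Just set csize = 1 cluster"; a later read may still fail. */
        *csize_out = cluster_size;
        return ENTRY_CLAMPED;
    }
    *csize_out = csize;
    return ENTRY_OK;
}
```

With 64 KiB clusters the sector-count field can encode up to 128 KiB, which is why an oversized csize can appear at all. Clamping only changes how much is read and accounted; whether the data decompresses is left to the actual read path.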

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 4 months ago

17.08.2018 15:22, Vladimir Sementsov-Ogievskiy wrote:
> Rewrite corrupted L2 table entry, which reference space out of
> underlying file.
>
> Make this L2 table entry read-as-all-zeros without any allocation.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>   block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++++
>   1 file changed, 32 insertions(+)
>
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index 3c004e5bfe..3de3768a3c 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -1720,8 +1720,30 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>               /* Mark cluster as used */
>               csize = (((l2_entry >> s->csize_shift) & s->csize_mask) + 1) *
>                       BDRV_SECTOR_SIZE;
> +            if (csize > s->cluster_size) {
> +                ret = fix_l2_entry_to_zero(
> +                        bs, res, fix, l2_offset, i, active,
> +                        "compressed cluster larger than cluster: size 0x%"
> +                        PRIx64, csize);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +                continue;
> +            }
> +
>               coffset = l2_entry & s->cluster_offset_mask &
>                         ~(BDRV_SECTOR_SIZE - 1);
> +            if (coffset >= bdrv_getlength(bs->file->bs)) {
> +                ret = fix_l2_entry_to_zero(
> +                        bs, res, fix, l2_offset, i, active,
> +                        "compressed cluster out of file: offset 0x%" PRIx64,
> +                        coffset);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +                continue;
> +            }
> +
>               ret = qcow2_inc_refcounts_imrt(bs, res,
>                                              refcount_table, refcount_table_size,
>                                              coffset, csize);
> @@ -1748,6 +1770,16 @@ static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res,
>           {
>               uint64_t offset = l2_entry & L2E_OFFSET_MASK;
>   
> +            if (offset >= bdrv_getlength(bs->file->bs)) {
> +                ret = fix_l2_entry_to_zero(
> +                        bs, res, fix, l2_offset, i, active,
> +                        "cluster out of file: offset 0x%" PRIx64, offset);
> +                if (ret < 0) {
> +                    goto fail;
> +                }
> +                continue;
> +            }
> +
>               if (flags & CHECK_FRAG_INFO) {
>                   res->bfi.allocated_clusters++;
>                   if (next_contiguous_offset &&

Hmm, interesting question here: in the case of a misaligned L2 entry, we
zero it out only for QCOW2_CLUSTER_ZERO_ALLOC, but not for normal
clusters. Why? I think it is OK to mark a misaligned normal-cluster L2
entry as zero; otherwise we'll have fatal corruption on any operation on
this cluster.

-- 
Best regards,
Vladimir
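
The asymmetry described in this message can be illustrated with a small sketch. It is a simplified model, not the real check code: `AlignCheck` and `check_l2_alignment` are hypothetical names, and the behaviour (rewrite a misaligned preallocated zero cluster to a plain zero entry, only report a misaligned normal cluster) is taken from the description above.

```c
#include <stdint.h>

#define L2E_OFFSET_MASK 0x00fffffffffffe00ULL  /* host offset bits of an L2 entry */
#define OFLAG_ZERO      (1ULL << 0)            /* cluster reads as zeroes */

/* Hypothetical classification used only in this sketch. */
typedef enum {
    ALIGNED,            /* offset is cluster-aligned, nothing to do */
    FIXED_ZERO_ALLOC,   /* misaligned preallocated zero cluster, repaired */
    MISALIGNED_NORMAL   /* misaligned normal cluster, reported only */
} AlignCheck;

static AlignCheck check_l2_alignment(uint64_t *l2_entry, int cluster_bits)
{
    uint64_t cluster_size = 1ULL << cluster_bits;
    uint64_t offset = *l2_entry & L2E_OFFSET_MASK;

    if ((offset & (cluster_size - 1)) == 0) {
        return ALIGNED;
    }
    if (*l2_entry & OFLAG_ZERO) {
        /*
         * Drop the bogus preallocation offset and keep only the zero
         * flag: the guest-visible contents (zeroes) do not change.
         */
        *l2_entry = OFLAG_ZERO;
        return FIXED_ZERO_ALLOC;
    }
    /* Normal cluster: the entry is left as it is. */
    return MISALIGNED_NORMAL;
}
```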

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 4 months ago

10.10.2018 19:39, Vladimir Sementsov-Ogievskiy wrote:
> Hmm, interesting question here: in the case of a misaligned L2 entry, we
> zero it out only for QCOW2_CLUSTER_ZERO_ALLOC, but not for normal
> clusters. Why? I think it is OK to mark a misaligned normal-cluster L2
> entry as zero; otherwise we'll have fatal corruption on any operation on
> this cluster.
>

or we can just align them down.

-- 
Best regards,
Vladimir
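
For reference, "aligning down" an L2 entry amounts to clearing the low offset bits while keeping the flag bits. The sketch below is a hypothetical helper (`align_l2_entry_down` is not a QEMU function); it only shows the bit manipulation and says nothing about whether the data at the aligned offset is the right data.

```c
#include <stdint.h>

#define L2E_OFFSET_MASK 0x00fffffffffffe00ULL  /* host offset bits of an L2 entry */

/* Hypothetical helper: round the host offset down to a cluster boundary. */
static uint64_t align_l2_entry_down(uint64_t l2_entry, int cluster_bits)
{
    uint64_t cluster_size = 1ULL << cluster_bits;
    uint64_t offset = l2_entry & L2E_OFFSET_MASK;
    uint64_t flags = l2_entry & ~L2E_OFFSET_MASK;   /* everything else */

    return flags | (offset & ~(cluster_size - 1));
}
```

After this rewrite the entry points at the cluster that starts at the rounded-down boundary, so the guest sees whatever that cluster holds.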

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 4 months ago

10.10.2018 19:55, Vladimir Sementsov-Ogievskiy wrote:
> 10.10.2018 19:39, Vladimir Sementsov-Ogievskiy wrote:
>> Hmm, interesting question here: in the case of a misaligned L2 entry, we
>> zero it out only for QCOW2_CLUSTER_ZERO_ALLOC, but not for normal
>> clusters. Why? I think it is OK to mark a misaligned normal-cluster L2
>> entry as zero; otherwise we'll have fatal corruption on any operation on
>> this cluster.
>>
>
> or we can just align them down.
>

And why do we calculate refcounts for a corrupted L2 entry? Is it correct
to count the data range referenced by this entry if we'll never succeed
in writing or reading this data?

-- 
Best regards,
Vladimir

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Max Reitz 7 years, 3 months ago

On 10.10.18 18:59, Vladimir Sementsov-Ogievskiy wrote:
> 10.10.2018 19:55, Vladimir Sementsov-Ogievskiy wrote:
>> 10.10.2018 19:39, Vladimir Sementsov-Ogievskiy wrote:
>>> Hmm, interesting question here: in the case of a misaligned L2 entry, we
>>> zero it out only for QCOW2_CLUSTER_ZERO_ALLOC, but not for normal
>>> clusters. Why? I think it is OK to mark a misaligned normal-cluster L2
>>> entry as zero; otherwise we'll have fatal corruption on any operation on
>>> this cluster.

Because for zero clusters the solution is clear.  We just throw away the
obviously wrong preallocation information, but the cluster data stays
the same (zero).  So there is no data loss.

For normal clusters, you definitely destroy the data by zeroing them out.

>> or we can just align them down.

Which would destroy the data as well.

You can argue that if the value is misaligned, it is extremely likely to
be just garbage as a whole, though.  But in any case, it is not obvious
what to do and always means data loss (which is different from zero
clusters, where you can just keep them zero).

The clearest and most obvious solution would be to allocate a new
cluster and copy the unaligned data there.  Maybe that doesn't make
sense because the data is probably garbage anyway, but it definitely
won't harm.

> And why do we calculate refcounts for a corrupted L2 entry? Is it correct
> to count the data range referenced by this entry if we'll never succeed
> in writing or reading this data?

It's definitely better to mark something wrongly as referenced than
wrongly as free.

The only difference it makes is that maybe we could save some space, but
if there are any such corruptions, saving space really is the least of
the user's issues.

Max
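
To illustrate the point about refcounts, here is a simplified model of counting a referenced byte range in an in-memory refcount table. It is only a stand-in for what `qcow2_inc_refcounts_imrt()` does in the quoted hunk: `inc_refcounts` is a hypothetical name, the table is a fixed-size array that is never grown, and overflow of `offset + size` is not handled.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: bump the refcount of every cluster touched by the
 * byte range [offset, offset + size).  Returns -1 if the range reaches
 * beyond the table, 0 otherwise.
 */
static int inc_refcounts(uint16_t *table, size_t nb_clusters,
                         int cluster_bits, uint64_t offset, uint64_t size)
{
    if (size == 0) {
        return 0;
    }

    uint64_t first = offset >> cluster_bits;
    uint64_t last = (offset + size - 1) >> cluster_bits;

    if (last >= nb_clusters) {
        return -1;
    }
    for (uint64_t k = first; k <= last; k++) {
        table[k]++;
    }
    return 0;
}
```

A misaligned entry such as offset 0x10200 with 64 KiB clusters straddles clusters 1 and 2, so both end up marked as referenced. In this model the cost is at most some wasted space, whereas skipping the entry would leave those clusters looking free.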

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Vladimir Sementsov-Ogievskiy 7 years, 1 month ago

13.10.2018 15:58, Max Reitz wrote:
> On 10.10.18 18:59, Vladimir Sementsov-Ogievskiy wrote:
>> 10.10.2018 19:55, Vladimir Sementsov-Ogievskiy wrote:
>>> 10.10.2018 19:39, Vladimir Sementsov-Ogievskiy wrote:
>>>> Hmm, interesting question here: in the case of a misaligned L2 entry, we
>>>> zero it out only for QCOW2_CLUSTER_ZERO_ALLOC, but not for normal
>>>> clusters. Why? I think it is OK to mark a misaligned normal-cluster L2
>>>> entry as zero; otherwise we'll have fatal corruption on any operation on
>>>> this cluster.
> 
> Because for zero clusters the solution is clear.  We just throw away the
> obviously wrong preallocation information, but the cluster data stays
> the same (zero).  So there is no data loss.
> 
> For normal clusters, you definitely destroy the data by zeroing them out.
> 
>>> or we can just align them down.
> 
> Which would destroy the data as well.
> 
> You can argue that if the value is misaligned, it is extremely likely to
> be just garbage as a whole, though.  But in any case, it is not obvious
> what to do and always means data loss (which is different from zero
> clusters, where you can just keep them zero).
> 
> The clearest and most obvious solution would be to allocate a new
> cluster and copy the unaligned data there.  Maybe that doesn't make
> sense because the data is probably garbage anyway, but it definitely
> won't harm.


But what would we copy? I think it is almost impossible for a data cluster
to really be misaligned. A partly wrong L2 entry is more probable. So, with
your approach we would lose this data (as we lose the L2 entry, our last
hope). Finally, what should we do with a misaligned cluster on check? We
definitely should do something, as trying to access such a cluster corrupts
the qcow2 image in qemu.

What about an additional flag like "-align-misaligned-clusters-down"?

> 
>> And why do we calculate refcounts for a corrupted L2 entry? Is it correct
>> to count the data range referenced by this entry if we'll never succeed
>> in writing or reading this data?
> 
> It's definitely better to mark something wrongly as referenced than
> wrongly as free.
> 
> The only difference it makes is that maybe we could save some space, but
> if there are any such corruptions, saving space really is the least of
> the user's issues.
> 
> Max
> 


-- 
Best regards,
Vladimir

Re: [Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero

Posted by Max Reitz 7 years, 1 month ago

On 12.12.18 09:36, Vladimir Sementsov-Ogievskiy wrote:
> 13.10.2018 15:58, Max Reitz wrote:
>> On 10.10.18 18:59, Vladimir Sementsov-Ogievskiy wrote:
>>> 10.10.2018 19:55, Vladimir Sementsov-Ogievskiy wrote:
>>>> 10.10.2018 19:39, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Hmm, interesting question here: in the case of a misaligned L2 entry, we
>>>>> zero it out only for QCOW2_CLUSTER_ZERO_ALLOC, but not for normal
>>>>> clusters. Why? I think it is OK to mark a misaligned normal-cluster L2
>>>>> entry as zero; otherwise we'll have fatal corruption on any operation on
>>>>> this cluster.
>>
>> Because for zero clusters the solution is clear.  We just throw away the
>> obviously wrong preallocation information, but the cluster data stays
>> the same (zero).  So there is no data loss.
>>
>> For normal clusters, you definitely destroy the data by zeroing them out.
>>
>>>> or we can just align them down.
>>
>> Which would destroy the data as well.
>>
>> You can argue that if the value is misaligned, it is extremely likely to
>> be just garbage as a whole, though.  But in any case, it is not obvious
>> what to do and always means data loss (which is different from zero
>> clusters, where you can just keep them zero).
>>
>> The clearest and most obvious solution would be to allocate a new
>> cluster and copy the unaligned data there.  Maybe that doesn't make
>> sense because the data is probably garbage anyway, but it definitely
>> won't harm.
> 
> 
> But what would we copy? I think it is almost impossible for a data cluster
> to really be misaligned. A partly wrong L2 entry is more probable.

What do you mean by "partly"?  I think having eight bytes "partly" wrong
is not very probable either.

I do agree that it's more likely that the L2 information is just garbage
than that the cluster base really is misaligned.  But I think it would
be garbage as a whole.

> So, with your approach we would lose this data (as we lose the L2 entry, our last hope).

So you think we should set the zero bit and leave the rest of the
cluster as it is?  But the resulting image would not be correct (because
the preallocation offset is wrong), so I don't see that as a good way of
repairing.

On one hand I think we want some repair option to explicitly acknowledge
data loss.  Like invalid bitmaps being removed or invalid L2 entries
being set to some value that is valid.

On the other, I would imagine that one usually runs qemu-img check
without -r on a broken image first to see what's up; at least if they
intend to have a deep look into it at all.  I think people should be
aware that -r all may destroy these kinds of leads.

But in any case, since I think the chances of the L2 entry only being
partly wrong are very small, I think it doesn't bring much to keep that
data around anyway.  I only find it useful in finding out why the
corruption occurred in the first place (by seeing what kind of data it
was overwritten with).

> Finally, what should we do with a
> misaligned cluster on check? We definitely should do something, as trying to
> access such a cluster corrupts the qcow2 image in qemu.

Well, I gave a description of what I think should be done; which is to
allocate a new cluster, copy the unaligned data there, and then make the
entry point to that new cluster.

> What about an additional flag like "-align-misaligned-clusters-down"?

It would probably make more sense to add flags to the qemu-img check
infrastructure than adding a new -r mode, yes.

Max
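
The "allocate a new cluster and copy the unaligned data there" repair can be sketched on an in-memory buffer standing in for the image file. Everything here is hypothetical (`FakeFile`, `relocate_unaligned`); a real implementation would go through the qcow2 cluster allocator and update refcounts, which this sketch leaves out. It assumes the new cluster is simply appended at the next cluster boundary past the end of the file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the underlying image file. */
typedef struct {
    uint8_t *data;   /* backing buffer */
    size_t len;      /* current file length in bytes */
    size_t cap;      /* capacity of the backing buffer */
} FakeFile;

/*
 * Copy up to one cluster of data starting at the misaligned offset
 * old_off into a new, aligned cluster appended to the file.  Returns the
 * offset of the new cluster (to be stored in the L2 entry), or -1 if
 * old_off is outside the file or the buffer is too small.
 */
static int64_t relocate_unaligned(FakeFile *f, uint64_t old_off,
                                  int cluster_bits)
{
    size_t cs = (size_t)1 << cluster_bits;
    size_t new_off = (f->len + cs - 1) & ~(cs - 1);   /* align up */

    if (old_off >= f->len || new_off + cs > f->cap) {
        return -1;
    }

    /* The old range may run past EOF; copy only what exists. */
    size_t avail = f->len - (size_t)old_off;
    if (avail > cs) {
        avail = cs;
    }

    memset(f->data + new_off, 0, cs);                 /* pad with zeroes */
    memcpy(f->data + new_off, f->data + old_off, avail);
    f->len = new_off + cs;
    return (int64_t)new_off;
}
```

Source and destination cannot overlap because the new cluster starts at or beyond the old end of file. Nothing is discarded by this repair: the bytes the entry pointed at stay readable at an aligned location, whether or not they are garbage.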

[Qemu-devel] [PATCH v2 1/7] block/qcow2-refcount: fix check_oflag_copied
[Qemu-devel] [PATCH v2 2/7] block/qcow2-refcount: avoid eating RAM
[Qemu-devel] [PATCH v2 3/7] block/qcow2-refcount: check_refcounts_l2: refactor compressed case
[Qemu-devel] [PATCH v2 4/7] block/qcow2-refcount: check_refcounts_l2: reduce ignored overlaps
[Qemu-devel] [PATCH v2 5/7] block/qcow2-refcount: check_refcounts_l2: split fix_l2_entry_to_zero
[Qemu-devel] [PATCH v2 6/7] block/qcow2-refcount: fix out-of-file L1 entries to be zero
[Qemu-devel] [PATCH v2 7/7] block/qcow2-refcount: fix out-of-file L2 entries to be read-as-zero