[v5] parallels: Refactor the code of images checks and fix a bug

[PATCH v5 2/9] parallels: Fix data_end field value in parallels_co_check()

Posted by Alexander Ivanov 3 years, 5 months ago

Make data_end pointing to the end of the last cluster if a leak was fixed.
Otherwise set the file size to data_end.

Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
---
 block/parallels.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/block/parallels.c b/block/parallels.c
index c245ca35cd..2be56871bc 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -513,7 +513,15 @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
             res->leaks_fixed += count;
         }
     }
-
+    /*
+     * If res->image_end_offset less than the file size,
+     * a leak was fixed and we should set the new offset to s->data_end.
+     * Otherwise set the file size to s->data_end.
+     */
+    if (res->image_end_offset < size && fix & BDRV_FIX_LEAKS) {
+        size = res->image_end_offset;
+    }
+    s->data_end = size >> BDRV_SECTOR_BITS;
 out:
     qemu_co_mutex_unlock(&s->lock);
     return ret;
-- 
2.34.1

Re: [PATCH v5 2/9] parallels: Fix data_end field value in parallels_co_check()

Posted by Denis V. Lunev 3 years, 5 months ago

On 22.08.2022 11:05, Alexander Ivanov wrote:
> Make data_end pointing to the end of the last cluster if a leak was fixed.
> Otherwise set the file size to data_end.
>
> Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
> ---
>   block/parallels.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/block/parallels.c b/block/parallels.c
> index c245ca35cd..2be56871bc 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -513,7 +513,15 @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
>               res->leaks_fixed += count;
>           }
>       }
> -
> +    /*
> +     * If res->image_end_offset less than the file size,
> +     * a leak was fixed and we should set the new offset to s->data_end.
> +     * Otherwise set the file size to s->data_end.
> +     */
> +    if (res->image_end_offset < size && fix & BDRV_FIX_LEAKS) {
technically this is correct, but please add braces around
(fix & BDRV_FIX_LEAKS) - the code would look better

> +        size = res->image_end_offset;
> +    }
> +    s->data_end = size >> BDRV_SECTOR_BITS;
>   out:
>       qemu_co_mutex_unlock(&s->lock);
>       return ret;

Re: [PATCH v5 2/9] parallels: Fix data_end field value in parallels_co_check()

Posted by Vladimir Sementsov-Ogievskiy 3 years, 5 months ago

On 8/22/22 12:05, Alexander Ivanov wrote:
> Make data_end pointing to the end of the last cluster if a leak was fixed.
> Otherwise set the file size to data_end.
> 
> Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
> ---
>   block/parallels.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/block/parallels.c b/block/parallels.c
> index c245ca35cd..2be56871bc 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -513,7 +513,15 @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
>               res->leaks_fixed += count;
>           }
>       }
> -
> +    /*
> +     * If res->image_end_offset less than the file size,
> +     * a leak was fixed and we should set the new offset to s->data_end.
> +     * Otherwise set the file size to s->data_end.

I'm not sure about English :) For me "set A to B" means A := B, but you use it visa versa..

> +     */
> +    if (res->image_end_offset < size && fix & BDRV_FIX_LEAKS) {
> +        size = res->image_end_offset;
> +    }
> +    s->data_end = size >> BDRV_SECTOR_BITS;

Hmm, actually, when size < data_end, you can shrink data_end only if (fix & BDRV_FIX_ERRORS), otherwise BAT is still not fixed.

>   out:
>       qemu_co_mutex_unlock(&s->lock);
>       return ret;


Actually I think we only need to modify s->data_end after successful BAT fixing above. Then, bdrv_truncate is formally unrelated to s->data_end and shouldn't touch it.

So, I think, more correct is something like

diff --git a/block/parallels.c b/block/parallels.c
index 2be56871bc..b268788974 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -479,20 +479,24 @@ static int coroutine_fn parallels_co_check(BlockDriverState *bs,
          prev_off = off;
      }
  
      ret = 0;
      if (flush_bat) {
          ret = bdrv_co_pwrite_sync(bs->file, 0, s->header_size, s->header, 0);
          if (ret < 0) {
              res->check_errors++;
              goto out;
          }
+
+        /* We have dropped some wrong clusters, update data_end */
+        assert(s->data_end * BDRV_SECTOR_SIZE >= high_off + s->cluster_size);
+        s->data_end = (high_off + s->cluster_size) / BDRV_SECTOR_SIZE;
      }
  
      res->image_end_offset = high_off + s->cluster_size;
      if (size > res->image_end_offset) {




-- 
Best regards,
Vladimir

Re: [PATCH v5 2/9] parallels: Fix data_end field value in parallels_co_check()

Posted by Alexander Ivanov 3 years, 5 months ago

On 23.08.2022 09:38, Vladimir Sementsov-Ogievskiy wrote:
> On 8/22/22 12:05, Alexander Ivanov wrote:
>> Make data_end pointing to the end of the last cluster if a leak was 
>> fixed.
>> Otherwise set the file size to data_end.
>>
>> Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>
>> ---
>>   block/parallels.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/parallels.c b/block/parallels.c
>> index c245ca35cd..2be56871bc 100644
>> --- a/block/parallels.c
>> +++ b/block/parallels.c
>> @@ -513,7 +513,15 @@ static int coroutine_fn 
>> parallels_co_check(BlockDriverState *bs,
>>               res->leaks_fixed += count;
>>           }
>>       }
>> -
>> +    /*
>> +     * If res->image_end_offset less than the file size,
>> +     * a leak was fixed and we should set the new offset to 
>> s->data_end.
>> +     * Otherwise set the file size to s->data_end.
>
> I'm not sure about English :) For me "set A to B" means A := B, but 
> you use it visa versa..
I hesitated about this. I saw both variants and used the incorrect one 
probably. =)
>
>> +     */
>> +    if (res->image_end_offset < size && fix & BDRV_FIX_LEAKS) {
>> +        size = res->image_end_offset;
>> +    }
>> +    s->data_end = size >> BDRV_SECTOR_BITS;
>
> Hmm, actually, when size < data_end, you can shrink data_end only if 
> (fix & BDRV_FIX_ERRORS), otherwise BAT is still not fixed.
We shrink an image if there is a leak and (fix & BDRV_FIX_LEAKS). 
Otherwise we set data_end to the file size. In such a way we don't 
change the file size in parallels_close() even if we have out-of-image 
offsets in BAT.
>
>>   out:
>>       qemu_co_mutex_unlock(&s->lock);
>>       return ret;
>
>
> Actually I think we only need to modify s->data_end after successful 
> BAT fixing above. Then, bdrv_truncate is formally unrelated to 
> s->data_end and shouldn't touch it.
>
> So, I think, more correct is something like
>
> diff --git a/block/parallels.c b/block/parallels.c
> index 2be56871bc..b268788974 100644
> --- a/block/parallels.c
> +++ b/block/parallels.c
> @@ -479,20 +479,24 @@ static int coroutine_fn 
> parallels_co_check(BlockDriverState *bs,
>          prev_off = off;
>      }
>
>      ret = 0;
>      if (flush_bat) {
>          ret = bdrv_co_pwrite_sync(bs->file, 0, s->header_size, 
> s->header, 0);
>          if (ret < 0) {
>              res->check_errors++;
>              goto out;
>          }
> +
> +        /* We have dropped some wrong clusters, update data_end */
> +        assert(s->data_end * BDRV_SECTOR_SIZE >= high_off + 
> s->cluster_size);
> +        s->data_end = (high_off + s->cluster_size) / BDRV_SECTOR_SIZE;
>      }
>
>      res->image_end_offset = high_off + s->cluster_size;
>      if (size > res->image_end_offset) {
>
If an image was opened in RW mode for checking without fixing (I don't 
know if it's possible), data_end can be set outside the image and the 
image will be expanded in parallels_close().

OK, I would propose to fix data_end in two places:

1) after out-of-image check set data_end to the highest offset with (off 
+ cluster_size <= file_size) condition, independent on fix flags;

2) after leak check, only if (size > res->image_end_offset).

In such a way we can have only one inconsistence after any check: 
out-of-image offsets isn't fixed and we have offsets after data_end. But 
it is much better than if we set data_end to petabytes, for example.