[PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()

Tuguoyi posted 1 patch 3 years, 8 months ago
Failed in applying to current master (apply log)
block/qcow2-cluster.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Tuguoyi 3 years, 8 months ago
When calculating the offset, the result of left shift operation will be promoted
to type int64 automatically because the left operand of + operator is uint64_t.
but the result after integer promotion may be produce an error value for us and
trigger the following asserting error.

For example, consider i=0x2000, cluster_bits=18, the result of left shift
operation will be 0x80000000. Cause argument i is of signed integer type,
the result is automatically promoted to 0xffffffff80000000 which is not
we expected

The way to trigger the assertion error:
  qemu-img create -f qcow2 -o preallocation=full,cluster_size=256k tmpdisk 10G

This patch fix it by casting @i to uint64_t before doing left shift operation

Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com>
---
 block/qcow2-cluster.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index a677ba9..550850b 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -980,7 +980,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
 
     assert(l2_index + m->nb_clusters <= s->l2_slice_size);
     for (i = 0; i < m->nb_clusters; i++) {
-        uint64_t offset = cluster_offset + (i << s->cluster_bits);
+        uint64_t offset = cluster_offset + ((uint64_t)i << s->cluster_bits);
         /* if two concurrent writes happen to the same unallocated cluster
          * each write allocates separate cluster and writes data concurrently.
          * The first one to complete updates l2 table with pointer to its
-- 
2.7.4

--
Best regards,
Guoyi

Re: [PATCH for-5.1?] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Eric Blake 3 years, 8 months ago
On 8/5/20 4:22 AM, Tuguoyi wrote:
> When calculating the offset, the result of left shift operation will be promoted
> to type int64 automatically because the left operand of + operator is uint64_t.
> but the result after integer promotion may be produce an error value for us and
> trigger the following asserting error.
> 
> For example, consider i=0x2000, cluster_bits=18, the result of left shift
> operation will be 0x80000000. Cause argument i is of signed integer type,
> the result is automatically promoted to 0xffffffff80000000 which is not
> we expected
> 
> The way to trigger the assertion error:
>    qemu-img create -f qcow2 -o preallocation=full,cluster_size=256k tmpdisk 10G

I wonder if it is worth an iotest addition to cover this.

> 
> This patch fix it by casting @i to uint64_t before doing left shift operation
> 
> Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com>
> ---
>   block/qcow2-cluster.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

While this appears to be long-standing, rather than a regression in 5.1, 
this would be worth getting into -rc3 today if we still have time (if 
not, slipping to 5.2 is okay).

> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index a677ba9..550850b 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -980,7 +980,7 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
>   
>       assert(l2_index + m->nb_clusters <= s->l2_slice_size);
>       for (i = 0; i < m->nb_clusters; i++) {
> -        uint64_t offset = cluster_offset + (i << s->cluster_bits);
> +        uint64_t offset = cluster_offset + ((uint64_t)i << s->cluster_bits);

We have:
offset = uint64_t + (int << int)

which is indeed a cause of unwanted sign extension.

If I'm not mistaken, it would also be feasible to fix this by changing 
qcow2.h to fix typedef struct BDRVQcow2State to use an unsigned type for 
cluster_bits (the way we already do for struct QCowHeader), avoiding the 
need for a cast here.

But if we're trying to get this in today, rather than auditing all other 
uses of BDRVQcow2State for incorrect typing,

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Kevin Wolf 3 years, 8 months ago
Am 05.08.2020 um 11:22 hat Tuguoyi geschrieben:
> When calculating the offset, the result of left shift operation will be promoted
> to type int64 automatically because the left operand of + operator is uint64_t.
> but the result after integer promotion may be produce an error value for us and
> trigger the following asserting error.
> 
> For example, consider i=0x2000, cluster_bits=18, the result of left shift
> operation will be 0x80000000. Cause argument i is of signed integer type,
> the result is automatically promoted to 0xffffffff80000000 which is not
> we expected
> 
> The way to trigger the assertion error:
>   qemu-img create -f qcow2 -o preallocation=full,cluster_size=256k tmpdisk 10G
> 
> This patch fix it by casting @i to uint64_t before doing left shift operation
> 
> Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>


Re: [PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Peter Maydell 3 years, 8 months ago
On Wed, 5 Aug 2020 at 10:24, Tuguoyi <tu.guoyi@h3c.com> wrote:
>
> When calculating the offset, the result of left shift operation will be promoted
> to type int64 automatically because the left operand of + operator is uint64_t.
> but the result after integer promotion may be produce an error value for us and
> trigger the following asserting error.
>
> For example, consider i=0x2000, cluster_bits=18, the result of left shift
> operation will be 0x80000000. Cause argument i is of signed integer type,
> the result is automatically promoted to 0xffffffff80000000 which is not
> we expected
>
> The way to trigger the assertion error:
>   qemu-img create -f qcow2 -o preallocation=full,cluster_size=256k tmpdisk 10G
>
> This patch fix it by casting @i to uint64_t before doing left shift operation
>
> Signed-off-by: Guoyi Tu <tu.guoyi@h3c.com>
> ---

Applied to master, thanks.

-- PMM

Re: [PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Alberto Garcia 3 years, 8 months ago
On Wed 05 Aug 2020 11:22:58 AM CEST, Tuguoyi wrote:
> This patch fix it by casting @i to uint64_t before doing left shift
> operation

The patch seems fine and I also think that it's perhaps worth a test
case (although it only seems to happen with preallocation=falloc or full
so the test would need to generate very large files).

But I also wonder if there are other cases where this can happen.

nb_clusters is an int and there are more cases of

    nb_clusters << s->cluster_bits

I can see at least these: handle_alloc(), qcow2_free_any_clusters(),
qcow2_alloc_cluster_abort().

Berto

Re: [PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Alberto Garcia 3 years, 8 months ago
On Wed 05 Aug 2020 03:44:08 PM CEST, Alberto Garcia wrote:
> On Wed 05 Aug 2020 11:22:58 AM CEST, Tuguoyi wrote:
>> This patch fix it by casting @i to uint64_t before doing left shift
>> operation
>
> The patch seems fine and I also think that it's perhaps worth a test
> case (although it only seems to happen with preallocation=falloc or full
> so the test would need to generate very large files).
>
> But I also wonder if there are other cases where this can happen.
>
> nb_clusters is an int and there are more cases of
>
>     nb_clusters << s->cluster_bits
>
> I can see at least these: handle_alloc(), qcow2_free_any_clusters(),
> qcow2_alloc_cluster_abort().

I forgot to add

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto

Re: [PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Kevin Wolf 3 years, 8 months ago
Am 05.08.2020 um 15:44 hat Alberto Garcia geschrieben:
> On Wed 05 Aug 2020 11:22:58 AM CEST, Tuguoyi wrote:
> > This patch fix it by casting @i to uint64_t before doing left shift
> > operation
> 
> The patch seems fine and I also think that it's perhaps worth a test
> case (although it only seems to happen with preallocation=falloc or full
> so the test would need to generate very large files).
> 
> But I also wonder if there are other cases where this can happen.
> 
> nb_clusters is an int and there are more cases of
> 
>     nb_clusters << s->cluster_bits
> 
> I can see at least these: handle_alloc(), qcow2_free_any_clusters(),
> qcow2_alloc_cluster_abort().

Actuallyx, handle_alloc() and everything that comes from it should be
fine. It has a uint64_t nb_clusters locally and limits it:

    nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);

The problematic request that causes the crash comes actually from
qcow2_co_truncate(). It limits it only to s->l2_slice_size, which can be
larger than that, but will be at most 256k (= 2 MB / sizeof(uint64_t)).

cow_end.offset will get a wraparound then, too. This is harmless because
cow_end.nb_bytes = 0, so the offset will be ignored anyway.

I think the proper fix to be made in the 5.2 release cycle would revert
this one and instead fix the limit in qcow2_co_truncate().

But this one is good enough as a band-aid for 5.1.

Kevin


Re: [PATCH] qcow2-cluster: Fix integer left shift error in qcow2_alloc_cluster_link_l2()
Posted by Alberto Garcia 3 years, 8 months ago
On Wed 05 Aug 2020 04:16:57 PM CEST, Kevin Wolf wrote:
>> nb_clusters is an int and there are more cases of
>> 
>>     nb_clusters << s->cluster_bits
>> 
>> I can see at least these: handle_alloc(), qcow2_free_any_clusters(),
>> qcow2_alloc_cluster_abort().
>
> Actuallyx, handle_alloc() and everything that comes from it should be
> fine. It has a uint64_t nb_clusters locally and limits it:
>
>     nb_clusters = MIN(nb_clusters, INT_MAX >> s->cluster_bits);

INT_MAX replaced with BDRV_REQUEST_MAX_BYTES in my subcluster allocation
series, so it should still be fine.

> The problematic request that causes the crash comes actually from
> qcow2_co_truncate(). It limits it only to s->l2_slice_size, which can
> be larger than that, but will be at most 256k (= 2 MB /
> sizeof(uint64_t)).
>
> cow_end.offset will get a wraparound then, too. This is harmless
> because cow_end.nb_bytes = 0, so the offset will be ignored anyway.

In that one nb_clusters is actually int64_t so there's no wraparound.

> I think the proper fix to be made in the 5.2 release cycle would revert
> this one and instead fix the limit in qcow2_co_truncate().
>
> But this one is good enough as a band-aid for 5.1.

The other one is just as simple, no? This line in the while() loop in
qcow2_co_truncate():

   nb_clusters = MIN(nb_clusters, BDRV_REQUEST_MAX_BYTES >> s->cluster_bits);

Berto