[PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size

Keqian Zhu posted 2 patches 4 years, 11 months ago
Maintainers: Paolo Bonzini <pbonzini@redhat.com>
There is a newer version of this series
[PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size
Posted by Keqian Zhu 4 years, 11 months ago
When handle dirty log, we face qemu_real_host_page_size and
TARGET_PAGE_SIZE. The first one is the granule of KVM dirty
bitmap, and the second one is the granule of QEMU dirty bitmap.

Generally speaking, qemu_real_host_page_size >= TARGET_PAGE_SIZE,
so misuse TARGET_PAGE_SIZE to init kvmslot dirty_bmap may waste
memory. For example, when qemu_real_host_page_size is 64K and
TARGET_PAGE_SIZE is 4K, this bugfix can save 93.75% memory.

Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
---
 accel/kvm/kvm-all.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index baaa54249d..c5e06288eb 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -620,8 +620,12 @@ static void kvm_memslot_init_dirty_bitmap(KVMSlot *mem)
      * too, in most cases).
      * So for now, let's align to 64 instead of HOST_LONG_BITS here, in
      * a hope that sizeof(long) won't become >8 any time soon.
+     *
+     * Note: the granule of kvm dirty log is qemu_real_host_page_size.
+     * And mem->memory_size is aligned to it (otherwise this mem can't
+     * be registered to KVM).
      */
-    hwaddr bitmap_size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS),
+    hwaddr bitmap_size = ALIGN(mem->memory_size / qemu_real_host_page_size,
                                         /*HOST_LONG_BITS*/ 64) / 8;
     mem->dirty_bmap = g_malloc0(bitmap_size);
 }
-- 
2.23.0


Re: [PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size
Posted by Peter Xu 4 years, 11 months ago
On Tue, Dec 15, 2020 at 03:19:47PM +0800, Keqian Zhu wrote:
> When handle dirty log, we face qemu_real_host_page_size and
> TARGET_PAGE_SIZE. The first one is the granule of KVM dirty
> bitmap, and the second one is the granule of QEMU dirty bitmap.
> 
> Generally speaking, qemu_real_host_page_size >= TARGET_PAGE_SIZE,
> so misuse TARGET_PAGE_SIZE to init kvmslot dirty_bmap may waste
> memory. For example, when qemu_real_host_page_size is 64K and
> TARGET_PAGE_SIZE is 4K, this bugfix can save 93.75% memory.
> 
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> ---
>  accel/kvm/kvm-all.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index baaa54249d..c5e06288eb 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -620,8 +620,12 @@ static void kvm_memslot_init_dirty_bitmap(KVMSlot *mem)
>       * too, in most cases).
>       * So for now, let's align to 64 instead of HOST_LONG_BITS here, in
>       * a hope that sizeof(long) won't become >8 any time soon.
> +     *
> +     * Note: the granule of kvm dirty log is qemu_real_host_page_size.
> +     * And mem->memory_size is aligned to it (otherwise this mem can't
> +     * be registered to KVM).
>       */
> -    hwaddr bitmap_size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS),
> +    hwaddr bitmap_size = ALIGN(mem->memory_size / qemu_real_host_page_size,
>                                          /*HOST_LONG_BITS*/ 64) / 8;

Yes I think this is correct.  Thanks.

So one thing I failed to notice is cpu_physical_memory_set_dirty_lebitmap()
will "amplify" the host dirty pages into guest ones; seems we're all good then.

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu


Re: [PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size
Posted by Keqian Zhu 4 years, 11 months ago
Hi Peter,

On 2020/12/16 1:57, Peter Xu wrote:
> On Tue, Dec 15, 2020 at 03:19:47PM +0800, Keqian Zhu wrote:
>> When handle dirty log, we face qemu_real_host_page_size and
>> TARGET_PAGE_SIZE. The first one is the granule of KVM dirty
>> bitmap, and the second one is the granule of QEMU dirty bitmap.
>>
>> Generally speaking, qemu_real_host_page_size >= TARGET_PAGE_SIZE,
>> so misuse TARGET_PAGE_SIZE to init kvmslot dirty_bmap may waste
>> memory. For example, when qemu_real_host_page_size is 64K and
>> TARGET_PAGE_SIZE is 4K, this bugfix can save 93.75% memory.
>>
>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>> ---
>>  accel/kvm/kvm-all.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index baaa54249d..c5e06288eb 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -620,8 +620,12 @@ static void kvm_memslot_init_dirty_bitmap(KVMSlot *mem)
>>       * too, in most cases).
>>       * So for now, let's align to 64 instead of HOST_LONG_BITS here, in
>>       * a hope that sizeof(long) won't become >8 any time soon.
>> +     *
>> +     * Note: the granule of kvm dirty log is qemu_real_host_page_size.
>> +     * And mem->memory_size is aligned to it (otherwise this mem can't
>> +     * be registered to KVM).
>>       */
>> -    hwaddr bitmap_size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS),
>> +    hwaddr bitmap_size = ALIGN(mem->memory_size / qemu_real_host_page_size,
>>                                          /*HOST_LONG_BITS*/ 64) / 8;
> 
> Yes I think this is correct.  Thanks.
> 
> So one thing I failed to notice is cpu_physical_memory_set_dirty_lebitmap()
> will "amplify" the host dirty pages into guest ones; seems we're all good then.
> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> 
OK Thanks :-)

One more thing, we should consider whether @start and @size is psize aligned (my second
patch). Do you agree to add assert on them directly?


Thanks,
Keqian

Re: [PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size
Posted by Peter Xu 4 years, 11 months ago
On Wed, Dec 16, 2020 at 04:21:17PM +0800, Keqian Zhu wrote:
> One more thing, we should consider whether @start and @size is psize aligned (my second
> patch). Do you agree to add assert on them directly?

Yes I think the 2nd patch is okay, but please address Drew's comments.

Returning -EINVAL is the same as abort() currently - it'll just abort() at
kvm_log_clear() instead.

Thanks,

-- 
Peter Xu


Re: [PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size
Posted by Keqian Zhu 4 years, 11 months ago

On 2020/12/17 4:48, Peter Xu wrote:
> On Wed, Dec 16, 2020 at 04:21:17PM +0800, Keqian Zhu wrote:
>> One more thing, we should consider whether @start and @size is psize aligned (my second
>> patch). Do you agree to add assert on them directly?
> 
> Yes I think the 2nd patch is okay, but please address Drew's comments.
> 
> Returning -EINVAL is the same as abort() currently - it'll just abort() at
> kvm_log_clear() instead.
OK, I will send v2 soon.

Thanks,
Keqian

> 
> Thanks,
> 

Re: [PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size
Posted by Andrew Jones 4 years, 11 months ago
On Tue, Dec 15, 2020 at 03:19:47PM +0800, Keqian Zhu wrote:
> When handle dirty log, we face qemu_real_host_page_size and
> TARGET_PAGE_SIZE. The first one is the granule of KVM dirty
> bitmap, and the second one is the granule of QEMU dirty bitmap.
> 
> Generally speaking, qemu_real_host_page_size >= TARGET_PAGE_SIZE,

Not just generally speaking, but must be. We have

    /*
     * On systems where the kernel can support different base page
     * sizes, host page size may be different from TARGET_PAGE_SIZE,
     * even with KVM.  TARGET_PAGE_SIZE is assumed to be the minimum
     * page size for the system though.
     */
    assert(TARGET_PAGE_SIZE <= qemu_real_host_page_size);

at the top of kvm_init() to enforce it.

The comment also says TARGET_PAGE_SIZE is assumed to be the minimum,
which is more of a requirement than assumption. And, that requirement
implies that we require all memory regions and base addresses to be
aligned to the maximum possible target page size (in case the target
actually uses something larger).

Please remove 'Generally speaking' from the commit message. It
implies this change is based on an assumption rather than a rule.

(It'd be nice if we had these guest memory and TARGET_PAGE_SIZE
requirements better documented and in one place.)

> so misuse TARGET_PAGE_SIZE to init kvmslot dirty_bmap may waste
> memory. For example, when qemu_real_host_page_size is 64K and
> TARGET_PAGE_SIZE is 4K, this bugfix can save 93.75% memory.
> 
> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
> ---
>  accel/kvm/kvm-all.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index baaa54249d..c5e06288eb 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -620,8 +620,12 @@ static void kvm_memslot_init_dirty_bitmap(KVMSlot *mem)
>       * too, in most cases).
>       * So for now, let's align to 64 instead of HOST_LONG_BITS here, in
>       * a hope that sizeof(long) won't become >8 any time soon.
> +     *
> +     * Note: the granule of kvm dirty log is qemu_real_host_page_size.
> +     * And mem->memory_size is aligned to it (otherwise this mem can't
> +     * be registered to KVM).
>       */
> -    hwaddr bitmap_size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS),
> +    hwaddr bitmap_size = ALIGN(mem->memory_size / qemu_real_host_page_size,
>                                          /*HOST_LONG_BITS*/ 64) / 8;
>      mem->dirty_bmap = g_malloc0(bitmap_size);
>  }
> -- 
> 2.23.0
> 
>

Besides the commit message

Reviewed-by: Andrew Jones <drjones@redhat.com>


Thanks,
drew


Re: [PATCH 1/2] accel: kvm: Fix memory waste under mismatch page size
Posted by Keqian Zhu 4 years, 11 months ago
Hi drew,

On 2020/12/15 19:20, Andrew Jones wrote:
> On Tue, Dec 15, 2020 at 03:19:47PM +0800, Keqian Zhu wrote:
>> When handle dirty log, we face qemu_real_host_page_size and
>> TARGET_PAGE_SIZE. The first one is the granule of KVM dirty
>> bitmap, and the second one is the granule of QEMU dirty bitmap.
>>
>> Generally speaking, qemu_real_host_page_size >= TARGET_PAGE_SIZE,
> 
> Not just generally speaking, but must be. We have
> 
>     /*
>      * On systems where the kernel can support different base page
>      * sizes, host page size may be different from TARGET_PAGE_SIZE,
>      * even with KVM.  TARGET_PAGE_SIZE is assumed to be the minimum
>      * page size for the system though.
>      */
>     assert(TARGET_PAGE_SIZE <= qemu_real_host_page_size);
Yes, I noticed it, but my statement (Generally speaking) is not suitable.
Thanks for pointing it out.

> 
> at the top of kvm_init() to enforce it.
> 
> The comment also says TARGET_PAGE_SIZE is assumed to be the minimum,
> which is more of a requirement than assumption. And, that requirement
> implies that we require all memory regions and base addresses to be
> aligned to the maximum possible target page size (in case the target
> actually uses something larger).
> 
> Please remove 'Generally speaking' from the commit message. It
> implies this change is based on an assumption rather than a rule.
> 
> (It'd be nice if we had these guest memory and TARGET_PAGE_SIZE
> requirements better documented and in one place.)
Maybe someone could do this :-)

> 
>> so misuse TARGET_PAGE_SIZE to init kvmslot dirty_bmap may waste
>> memory. For example, when qemu_real_host_page_size is 64K and
>> TARGET_PAGE_SIZE is 4K, this bugfix can save 93.75% memory.
>>
>> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
>> ---
>>  accel/kvm/kvm-all.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>> index baaa54249d..c5e06288eb 100644
>> --- a/accel/kvm/kvm-all.c
>> +++ b/accel/kvm/kvm-all.c
>> @@ -620,8 +620,12 @@ static void kvm_memslot_init_dirty_bitmap(KVMSlot *mem)
>>       * too, in most cases).
>>       * So for now, let's align to 64 instead of HOST_LONG_BITS here, in
>>       * a hope that sizeof(long) won't become >8 any time soon.
>> +     *
>> +     * Note: the granule of kvm dirty log is qemu_real_host_page_size.
>> +     * And mem->memory_size is aligned to it (otherwise this mem can't
>> +     * be registered to KVM).
>>       */
>> -    hwaddr bitmap_size = ALIGN(((mem->memory_size) >> TARGET_PAGE_BITS),
>> +    hwaddr bitmap_size = ALIGN(mem->memory_size / qemu_real_host_page_size,
>>                                          /*HOST_LONG_BITS*/ 64) / 8;
>>      mem->dirty_bmap = g_malloc0(bitmap_size);
>>  }
>> -- 
>> 2.23.0
>>
>>
> 
> Besides the commit message
> 
> Reviewed-by: Andrew Jones <drjones@redhat.com>
> 
OK, I will correct it and send v2 soon.

Cheers,
Keqian
> 
> Thanks,
> drew
> 
> .
>