[PATCH v7 0/4] s390: stop abusing memory_region_allocate_system_memory()

Igor Mammedov posted 4 patches 4 years, 6 months ago
Test docker-clang@ubuntu passed
Test checkpatch passed
Test docker-mingw@fedora passed
Test docker-quick@centos7 passed
Test asan passed
Test FreeBSD passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20190924144751.24149-1-imammedo@redhat.com
Maintainers: Cornelia Huck <cohuck@redhat.com>, Richard Henderson <rth@twiddle.net>, Halil Pasic <pasic@linux.ibm.com>, David Hildenbrand <david@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Christian Borntraeger <borntraeger@de.ibm.com>
include/sysemu/kvm_int.h   |   1 +
accel/kvm/kvm-all.c        | 238 +++++++++++++++++++++++--------------
hw/s390x/s390-virtio-ccw.c |  30 +----
target/s390x/kvm.c         |  11 ++
4 files changed, 161 insertions(+), 119 deletions(-)
[PATCH v7 0/4] s390: stop abusing memory_region_allocate_system_memory()
Posted by Igor Mammedov 4 years, 6 months ago
Changelog:
  since v6:
    - include and rebase on top of
       [PATCH 0/2] kvm: clear dirty bitmaps from all overlapping memslots
        https://www.mail-archive.com/qemu-devel@nongnu.org/msg646200.html
    - minor fixups suggested during v6 review
    - more testing incl. hacked x86
  since v5:
    - [1/2] fix migration that wasn't starting and make sure that KVM part
      is able to handle 1:n MemorySection:memslot arrangement
  since v3:
    - fix compilation issue
    - advance HVA along with GPA in kvm_set_phys_mem()
  since v2:
    - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
      and drop migratable aliases patch as was agreed during v2 review
    - drop 4.2 machines patch as it's not prerequisite anymore
  since v1:
    - include 4.2 machines patch for adding compat RAM layout on top
    - 2/4 add missing in v1 patch for splitting too big MemorySection on
          several memslots
    - 3/4 amend code path on alias destruction to ensure that RAMBlock is
          cleaned properly
    - 4/4 add compat machine code to keep old layout (migration-wise) for
          4.1 and older machines 


While looking into unifying guest RAM allocation to use hostmem backends
for initial RAM (especially when -mempath is used) and retiring
memory_region_allocate_system_memory() API, leaving only single hostmem backend,
I was inspecting how currently it is used by boards and it turns out several
boards abuse it by calling the function several times (despite documented contract
forbiding it).

s390 is one of such boards where KVM limitation on memslot size got propagated
to board design and memory_region_allocate_system_memory() was abused to satisfy
KVM requirement for max RAM chunk where memory region alias would suffice.

Unfortunately, memory_region_allocate_system_memory() usage created migration
dependency where guest RAM is transferred in migration stream as several RAMBlocks
if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
migration breakage (documenting it in release notes) and leaving only KVM fix.

In order to replace these several RAM chunks with a single memdev and keep it
working with KVM memslot size limit, the later was modified to deal with 
memory section split on several KVMSlots and manual RAM splitting in s390
was replace by single memory_region_allocate_system_memory() call.

Tested:
  * s390 with hacked KVM_SLOT_MAX_BYTES = 128Mb
      - guest reboot cycle in ping-pong migration
  * x86 with hacke max memslot = 128 and manual_dirty_log_protect enabled
      - ping-pong migration with workload dirtying RAM around a split area



Igor Mammedov (2):
  kvm: split too big memory section on several memslots
  s390: do not call memory_region_allocate_system_memory() multiple
    times

Paolo Bonzini (2):
  kvm: extract kvm_log_clear_one_slot
  kvm: clear dirty bitmaps from all overlapping memslots

 include/sysemu/kvm_int.h   |   1 +
 accel/kvm/kvm-all.c        | 238 +++++++++++++++++++++++--------------
 hw/s390x/s390-virtio-ccw.c |  30 +----
 target/s390x/kvm.c         |  11 ++
 4 files changed, 161 insertions(+), 119 deletions(-)

-- 
2.18.1


Re: [PATCH v7 0/4] s390: stop abusing memory_region_allocate_system_memory()
Posted by Christian Borntraeger 4 years, 6 months ago
On 24.09.19 16:47, Igor Mammedov wrote:
> Changelog:
>   since v6:
>     - include and rebase on top of
>        [PATCH 0/2] kvm: clear dirty bitmaps from all overlapping memslots
>         https://www.mail-archive.com/qemu-devel@nongnu.org/msg646200.html
>     - minor fixups suggested during v6 review
>     - more testing incl. hacked x86
>   since v5:
>     - [1/2] fix migration that wasn't starting and make sure that KVM part
>       is able to handle 1:n MemorySection:memslot arrangement
>   since v3:
>     - fix compilation issue
>     - advance HVA along with GPA in kvm_set_phys_mem()
>   since v2:
>     - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
>       and drop migratable aliases patch as was agreed during v2 review
>     - drop 4.2 machines patch as it's not prerequisite anymore
>   since v1:
>     - include 4.2 machines patch for adding compat RAM layout on top
>     - 2/4 add missing in v1 patch for splitting too big MemorySection on
>           several memslots
>     - 3/4 amend code path on alias destruction to ensure that RAMBlock is
>           cleaned properly
>     - 4/4 add compat machine code to keep old layout (migration-wise) for
>           4.1 and older machines 
> 
> 
> While looking into unifying guest RAM allocation to use hostmem backends
> for initial RAM (especially when -mempath is used) and retiring
> memory_region_allocate_system_memory() API, leaving only single hostmem backend,
> I was inspecting how currently it is used by boards and it turns out several
> boards abuse it by calling the function several times (despite documented contract
> forbiding it).
> 
> s390 is one of such boards where KVM limitation on memslot size got propagated
> to board design and memory_region_allocate_system_memory() was abused to satisfy
> KVM requirement for max RAM chunk where memory region alias would suffice.
> 
> Unfortunately, memory_region_allocate_system_memory() usage created migration
> dependency where guest RAM is transferred in migration stream as several RAMBlocks
> if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
> migration breakage (documenting it in release notes) and leaving only KVM fix.
> 
> In order to replace these several RAM chunks with a single memdev and keep it
> working with KVM memslot size limit, the later was modified to deal with 
> memory section split on several KVMSlots and manual RAM splitting in s390
> was replace by single memory_region_allocate_system_memory() call.
> 
> Tested:
>   * s390 with hacked KVM_SLOT_MAX_BYTES = 128Mb
>       - guest reboot cycle in ping-pong migration
>   * x86 with hacke max memslot = 128 and manual_dirty_log_protect enabled
>       - ping-pong migration with workload dirtying RAM around a split area
> 
> 
> 
> Igor Mammedov (2):
>   kvm: split too big memory section on several memslots
>   s390: do not call memory_region_allocate_system_memory() multiple
>     times
> 
> Paolo Bonzini (2):
>   kvm: extract kvm_log_clear_one_slot
>   kvm: clear dirty bitmaps from all overlapping memslots
> 
>  include/sysemu/kvm_int.h   |   1 +
>  accel/kvm/kvm-all.c        | 238 +++++++++++++++++++++++--------------
>  hw/s390x/s390-virtio-ccw.c |  30 +----
>  target/s390x/kvm.c         |  11 ++
>  4 files changed, 161 insertions(+), 119 deletions(-)
> 

Series
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com




FWIW, I think I would like to add something like the following later on.


Subject: [PATCH 1/1] s390/kvm: split kvm mem slots at 4TB

Instead of splitting at an unaligned address, we can simply split at
4TB.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 target/s390x/kvm.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/target/s390x/kvm.c b/target/s390x/kvm.c
index ad2dd14f7e78..611f56f4b5ac 100644
--- a/target/s390x/kvm.c
+++ b/target/s390x/kvm.c
@@ -126,12 +126,11 @@
 /*
  * KVM does only support memory slots up to KVM_MEM_MAX_NR_PAGES pages
  * as the dirty bitmap must be managed by bitops that take an int as
- * position indicator. If we have a guest beyond that we will split off
- * new subregions. The split must happen on a segment boundary (1MB).
+ * position indicator. This would end at an unaligned  address
+ * (0x7fffff00000). As future variants might provide larger pages
+ * and to make all addresses properly aligned, let us split at 4TB.
  */
-#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1)
-#define SEG_MSK (~0xfffffULL)
-#define KVM_SLOT_MAX_BYTES ((KVM_MEM_MAX_NR_PAGES * TARGET_PAGE_SIZE) & SEG_MSK)
+#define KVM_SLOT_MAX_BYTES 4096UL*1024*1024*1024
 
 static CPUWatchpoint hw_watchpoint;
 /*
-- 
2.21.0


Re: [PATCH v7 0/4] s390: stop abusing memory_region_allocate_system_memory()
Posted by Christian Borntraeger 4 years, 5 months ago
On 24.09.19 16:47, Igor Mammedov wrote:
> Changelog:
>   since v6:
>     - include and rebase on top of
>        [PATCH 0/2] kvm: clear dirty bitmaps from all overlapping memslots
>         https://www.mail-archive.com/qemu-devel@nongnu.org/msg646200.html
>     - minor fixups suggested during v6 review
>     - more testing incl. hacked x86
>   since v5:
>     - [1/2] fix migration that wasn't starting and make sure that KVM part
>       is able to handle 1:n MemorySection:memslot arrangement
>   since v3:
>     - fix compilation issue
>     - advance HVA along with GPA in kvm_set_phys_mem()
>   since v2:
>     - break migration from old QEMU (since 2.12-4.1) for guest with >8TB RAM
>       and drop migratable aliases patch as was agreed during v2 review
>     - drop 4.2 machines patch as it's not prerequisite anymore
>   since v1:
>     - include 4.2 machines patch for adding compat RAM layout on top
>     - 2/4 add missing in v1 patch for splitting too big MemorySection on
>           several memslots
>     - 3/4 amend code path on alias destruction to ensure that RAMBlock is
>           cleaned properly
>     - 4/4 add compat machine code to keep old layout (migration-wise) for
>           4.1 and older machines 
> 
> 
> While looking into unifying guest RAM allocation to use hostmem backends
> for initial RAM (especially when -mempath is used) and retiring
> memory_region_allocate_system_memory() API, leaving only single hostmem backend,
> I was inspecting how currently it is used by boards and it turns out several
> boards abuse it by calling the function several times (despite documented contract
> forbiding it).
> 
> s390 is one of such boards where KVM limitation on memslot size got propagated
> to board design and memory_region_allocate_system_memory() was abused to satisfy
> KVM requirement for max RAM chunk where memory region alias would suffice.
> 
> Unfortunately, memory_region_allocate_system_memory() usage created migration
> dependency where guest RAM is transferred in migration stream as several RAMBlocks
> if it's more than KVM_SLOT_MAX_BYTES. During v2 review it was agreed to ignore
> migration breakage (documenting it in release notes) and leaving only KVM fix.
> 
> In order to replace these several RAM chunks with a single memdev and keep it
> working with KVM memslot size limit, the later was modified to deal with 
> memory section split on several KVMSlots and manual RAM splitting in s390
> was replace by single memory_region_allocate_system_memory() call.
> 
> Tested:
>   * s390 with hacked KVM_SLOT_MAX_BYTES = 128Mb
>       - guest reboot cycle in ping-pong migration
>   * x86 with hacke max memslot = 128 and manual_dirty_log_protect enabled
>       - ping-pong migration with workload dirtying RAM around a split area
> 

Thanks, v7 applied to s390-next.