 migration/migration.c    |  9 +++++----
 migration/postcopy-ram.c | 25 ++++++++-----------------
 migration/ram.c          | 18 ++++++++++++++----
 3 files changed, 27 insertions(+), 25 deletions(-)
This patch series introduces a set of fixes to the previous work proposed by
Hailiang Zhang to enable live memory snapshots in QEMU based on userfaultfd.
See the discussion here:
http://www.mail-archive.com/qemu-devel@nongnu.org/msg393118.html

These patches apply on top of:
https://github.com/coloft/qemu/tree/snapshot-v2
which is the latest version of Hailiang's work, and rely on the latest
userfaultfd work available in Andrea Arcangeli's Linux kernel tree:
https://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/log/?h=userfault

The original work was mainly tested on x86 tcg machines and was not working
on ARM/ARM64 tcg.
The fixes presented in this series enable the live memory snapshot
to work for ARM64 tcg guests running on top of an ARM64 host.

The main problems encountered were:
- QEMU uses a memory page size of 1KB for ARM. Even though this size is not
  supported by the Linux kernel, it is kept for backward compatibility
  with older ARM CPU MMUs. The initial work was write-unprotecting pages with
  a granularity not always aligned with the host page size, causing
  userfaultfd to fail.
- The VM execution was resumed right before the status of the migration
  was switched from MIGRATION_STATUS_SETUP to MIGRATION_STATUS_ACTIVE.
  This again caused the VM to trigger a "Bus error", due to the wrong
  status of some memory pages.
- When unprotecting a memory page, the flag
  UFFDIO_WRITEPROTECT_MODE_DONTWAKE was used. As a result, after a page was
  copied into the snapshot file, the virtual machine execution was not
  resumed.
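
To illustrate the first problem, here is a minimal sketch of how a range
expressed in target pages can be widened to host page boundaries before it is
handed to userfaultfd. This is not the code from the patches: the HostRange
type and the align_to_host_page() helper are hypothetical names used only for
illustration, and the sketch assumes the host page size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not part of QEMU or of this series. */
typedef struct {
    uint64_t start; /* host-page-aligned start address */
    uint64_t len;   /* length, a multiple of the host page size */
} HostRange;

/*
 * Widen [addr, addr + len) so that both ends fall on host page boundaries.
 * Assumes host_page_size is a non-zero power of two.
 */
static HostRange align_to_host_page(uint64_t addr, uint64_t len,
                                    uint64_t host_page_size)
{
    assert(host_page_size != 0 &&
           (host_page_size & (host_page_size - 1)) == 0);

    uint64_t mask = host_page_size - 1;
    uint64_t start = addr & ~mask;              /* round start down */
    uint64_t end = (addr + len + mask) & ~mask; /* round end up */

    HostRange r = { start, end - start };
    return r;
}
```

For example, a 1KB target page at 0x40000c00 on a host with 4KB pages is
widened to start 0x40000000 with length 0x1000, so the whole host page is
unprotected at once.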

To test the patches on an ARM64 host, boot an ARM64 tcg machine:

qemu-system-aarch64 -machine virt,accel=tcg -cpu cortex-a57 \
        -m 256 -kernel Image \
        -initrd rootfs.cpio.gz \
        -append "earlyprintk rw console=ttyAMA0" \
        -net nic -net user \
        -nographic -serial pty -monitor stdio

Start the migration from the QEMU monitor:

(qemu) migrate file:/root/test_snapshot

Resume the VM from the snapshot:

qemu-system-aarch64 -machine virt,accel=tcg -cpu cortex-a57 \
        -m 256 -kernel Image \
        -initrd rootfs.cpio.gz \
        -append "earlyprintk rw console=ttyAMA0" \
        -net nic -net user \
        -nographic -serial stdio -monitor pty \
        -incoming file:/root/test_snapshot

Christian Pinto (4):
  migration/postcopy-ram: check pagefault flags in userfaultfd thread
  migration/ram: Fix for ARM/ARM64 page size
  migration: snapshot thread
  migration/postcopy-ram: ram_set_pages_wp fix

 migration/migration.c    |  9 +++++----
 migration/postcopy-ram.c | 25 ++++++++-----------------
 migration/ram.c          | 18 ++++++++++++++----
 3 files changed, 27 insertions(+), 25 deletions(-)

--
2.11.0

* Christian Pinto (c.pinto@virtualopensystems.com) wrote:
> This patch series introduces a set of fixes to the previous work proposed by
> Hailiang Zhang to enable in QEMU live memory snapshot based
> on userfaultfd. See discussion here:
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg393118.html

Thanks for posting this,

> These patches apply on top of:
> https://github.com/coloft/qemu/tree/snapshot-v2
> that is the latest version of Hailiang's work, and rely on the latest work on
> userfaultfd available on Andrea Arcangeli's Linux kernel tree:
> https://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/log/?h=userfault
>
> The original work was mainly tested on x86 tcg machines and was not working
> on ARM/ARM64 tcg.
> The fixes presented in this series enable the live memory snapshot
> to work for ARM64 tcg guests running on top of an ARM64 host.
>
> The main problems encountered were:
> - QEMU uses for ARM a memory page size of 1KB. Even though this size is not
>   supported by the Linux kernel, it is kept for backward compatibility
>   with older ARM CPU MMUs. Initial work was write-unprotecting pages with
>   a granularity not always aligned with host page size, causing userfaultfd
>   to fail.

Yes, Power similarly has a 4KB target page size, even though
the host kernel normally uses a larger page size.

> - The VM execution was resumed right before the status of the migration
>   was switched from MIGRATION_STATUS_SETUP to MIGRATION_STATUS_ACTIVE.
>   This was causing again the VM to trigger a "Bus error", due to wrong
>   status of some memory pages.
> - When unprotecting a memory page the flag
>   UFFDIO_WRITEPROTECT_MODE_DONTWAKE was used. This way, after a page is
>   copied into snapshot file, the virtual machine execution is not resumed.
>
>
> To test the patches on an ARM64 host, boot an ARM64 tcg machine:
>
> qemu-system-aarch64 -machine virt,accel=tcg -cpu cortex-a57 \
>         -m 256 -kernel Image \
>         -initrd rootfs.cpio.gz \
>         -append "earlyprintk rw console=ttyAMA0" \
>         -net nic -net user \
>         -nographic -serial pty -monitor stdio
>
> start migration from QEMU monitor:
>
> (qemu) migrate file:/root/test_snapshot
>
>
> resume VM from snapshot:
>
> qemu-system-aarch64 -machine virt,accel=tcg -cpu cortex-a57 \
>         -m 256 -kernel Image \
>         -initrd rootfs.cpio.gz \
>         -append "earlyprintk rw console=ttyAMA0" \
>         -net nic -net user \
>         -nographic -serial stdio -monitor pty \
>         -incoming file:/root/test_snapshot

Nice, what's your use case and how are you dealing with storage?

Dave

> Christian Pinto (4):
>   migration/postcopy-ram: check pagefault flags in userfaultfd thread
>   migration/ram: Fix for ARM/ARM64 page size
>   migration: snapshot thread
>   migration/postcopy-ram: ram_set_pages_wp fix
>
>  migration/migration.c    |  9 +++++----
>  migration/postcopy-ram.c | 25 ++++++++-----------------
>  migration/ram.c          | 18 ++++++++++++++----
>  3 files changed, 27 insertions(+), 25 deletions(-)
>
> --
> 2.11.0
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Hello Alan,

On 09/03/2017 18:46, Dr. David Alan Gilbert wrote:
> * Christian Pinto (c.pinto@virtualopensystems.com) wrote:
>> This patch series introduces a set of fixes to the previous work proposed by
>> Hailiang Zhang to enable in QEMU live memory snapshot based
>> on userfaultfd. See discussion here:
>> http://www.mail-archive.com/qemu-devel@nongnu.org/msg393118.html
> Thanks for posting this,
>
>> These patches apply on top of:
>> https://github.com/coloft/qemu/tree/snapshot-v2
>> that is the latest version of Hailiang's work, and rely on the latest work on
>> userfaultfd available on Andrea Arcangeli's Linux kernel tree:
>> https://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/log/?h=userfault
>>
>> The original work was mainly tested on x86 tcg machines and was not working
>> on ARM/ARM64 tcg.
>> The fixes presented in this series enable the live memory snapshot
>> to work for ARM64 tcg guests running on top of an ARM64 host.
>>
>> The main problems encountered were:
>> - QEMU uses for ARM a memory page size of 1KB. Even though this size is not
>>   supported by the Linux kernel, it is kept for backward compatibility
>>   with older ARM CPU MMUs. Initial work was write-unprotecting pages with
>>   a granularity not always aligned with host page size, causing userfaultfd
>>   to fail.
> Yes, Power similarly has a 4kb size for the target page size even though
> the host kernel is normally a large page size.

The fix included in this series should solve the problem for Power as well,
since it makes sure the address passed to userfaultfd is aligned
to the host page size. So, if someone in the Power community is interested
in this functionality, this fix might come in handy.

>
>> - The VM execution was resumed right before the status of the migration
>>   was switched from MIGRATION_STATUS_SETUP to MIGRATION_STATUS_ACTIVE.
>>   This was causing again the VM to trigger a "Bus error", due to wrong
>>   status of some memory pages.
>> - When unprotecting a memory page the flag
>>   UFFDIO_WRITEPROTECT_MODE_DONTWAKE was used. This way, after a page is
>>   copied into snapshot file, the virtual machine execution is not resumed.
>>
>>
>> To test the patches on an ARM64 host, boot an ARM64 tcg machine:
>>
>> qemu-system-aarch64 -machine virt,accel=tcg -cpu cortex-a57 \
>>         -m 256 -kernel Image \
>>         -initrd rootfs.cpio.gz \
>>         -append "earlyprintk rw console=ttyAMA0" \
>>         -net nic -net user \
>>         -nographic -serial pty -monitor stdio
>>
>> start migration from QEMU monitor:
>>
>> (qemu) migrate file:/root/test_snapshot
>>
>>
>> resume VM from snapshot:
>>
>> qemu-system-aarch64 -machine virt,accel=tcg -cpu cortex-a57 \
>>         -m 256 -kernel Image \
>>         -initrd rootfs.cpio.gz \
>>         -append "earlyprintk rw console=ttyAMA0" \
>>         -net nic -net user \
>>         -nographic -serial stdio -monitor pty \
>>         -incoming file:/root/test_snapshot
> Nice, what's your use case and how are you dealing with storage?

This work is being done in the context of an H2020 European project named
ExaNoDe (http://exanode.eu), which is building a prototype ARM64-based
compute node for the exascale domain (computing capabilities on the order
of an exaflop).
In this project, which targets HPC, scientific applications using MPI will be
executed in virtualized computing nodes (KVM VMs) rather than directly on
physical machines. This is mainly to improve the manageability of the overall
system and to ease the task of separating different workloads.
The work on live memory snapshots is meant to tackle the problem of system
resiliency, reducing the overall impact on the virtualized software and
leading to higher availability of the virtualized computing nodes.

For the time being we are focusing on memory, and storage has not yet
been taken into consideration. However, at first glance I would say that
storage in QEMU already uses CoW, which could be useful for this scenario
as well.

Thanks,

Christian

>
> Dave
>
>> Christian Pinto (4):
>>   migration/postcopy-ram: check pagefault flags in userfaultfd thread
>>   migration/ram: Fix for ARM/ARM64 page size
>>   migration: snapshot thread
>>   migration/postcopy-ram: ram_set_pages_wp fix
>>
>>  migration/migration.c    |  9 +++++----
>>  migration/postcopy-ram.c | 25 ++++++++-----------------
>>  migration/ram.c          | 18 ++++++++++++++----
>>  3 files changed, 27 insertions(+), 25 deletions(-)
>>
>> --
>> 2.11.0
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK