This RFC adds a virtual device for snapshot/restore within QEMU. I am
working on this as part of QEMU Google Summer of Code 2022. Fast
snapshot/restore within QEMU is helpful for code fuzzing.

I reused the migration code for saving and restoring virtual device and
CPU state. As for the RAM, I am using a simple COW mmapped file to do
restores.

The loadvm migration function I used for doing restores only worked
after I called it from a qemu_bh. I'm not sure if I should run the
migration code in a separate thread (see patch 3), since currently it
runs as part of the device code in the vCPU thread.

This is a rough first revision, and feedback on the CPU and device
state restores is appreciated.

To test locally, boot up any Linux distro. I used the following C file
to interact with the PCI snapshot device:

#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main() {
    int fd = open("/sys/bus/pci/devices/0000:00:04.0/resource0",
                  O_RDWR | O_SYNC);
    size_t size = 1024 * 1024;
    uint32_t *memory = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);

    printf("%x\n", memory[0]);

    int a = 0;
    memory[0] = 0x101; // save snapshot
    printf("before: value of a = %d\n", a);
    a = 1;
    printf("middle: value of a = %d\n", a);
    memory[0] = 0x102; // load snapshot
    printf("after: value of a = %d\n", a);

    return 0;
}

Richard Liu (3):
  create skeleton snapshot device and add docs
  implement ram save/restore
  use migration code for cpu and device save/restore

 docs/devel/snapshot.rst |  26 +++++++
 hw/i386/Kconfig         |   1 +
 hw/misc/Kconfig         |   3 +
 hw/misc/meson.build     |   1 +
 hw/misc/snapshot.c      | 164 ++++++++++++++++++++++++++++++++++++++++
 migration/savevm.c      |  84 ++++++++++++++++++++
 migration/savevm.h      |   3 +
 7 files changed, 282 insertions(+)
 create mode 100644 docs/devel/snapshot.rst
 create mode 100644 hw/misc/snapshot.c

-- 
2.35.1
Hi Richard,

On 7/22/22 21:20, Richard Liu wrote:
> This RFC adds a virtual device for snapshot/restore within QEMU. I am
> working on this as part of QEMU Google Summer of Code 2022. Fast
> snapshot/restore within QEMU is helpful for code fuzzing.
>
> I reused the migration code for saving and restoring virtual device
> and CPU state. As for the RAM, I am using a simple COW mmapped file
> to do restores.
>
> The loadvm migration function I used for doing restores only worked
> after I called it from a qemu_bh. I'm not sure if I should run the
> migration code in a separate thread (see patch 3), since currently it
> runs as part of the device code in the vCPU thread.
>
> This is a rough first revision, and feedback on the CPU and device
> state restores is appreciated.

As I understand it, saving and restoring VM state in QEMU is usually
best managed through the libvirt APIs, for example using the libvirt
command-line tool virsh:

$ virsh save (or managedsave)

$ virsh restore (or start)

These commands start a QEMU migration, driven over the QMP protocol, to
a file descriptor previously opened by libvirt to hold the state file.

(getfd QMP command):
https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-2811

(migrate QMP command):
https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-1947

This is unfortunately currently very slow. Maybe you could help with
thinking out, or implementing, a solution?

I tried to push an approach that only involves libvirt, using the
existing QEMU multifd migration to a socket:

https://listman.redhat.com/archives/libvir-list/2022-June/232252.html

Performance is very good compared with what is possible today, but it
won't be upstreamable because it is not deemed optimal, and libvirt
wants the code to be in QEMU.

What about helping to think out what the QEMU-based solution could look
like?
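[The fd-based flow described above looks roughly like this on the QMP
monitor. A sketch: the fd name "savefd" is arbitrary, and the file
descriptor itself must first be passed to QEMU over the monitor's UNIX
socket with SCM_RIGHTS ancillary data.]

```json
{ "execute": "getfd", "arguments": { "fdname": "savefd" } }
{ "return": {} }
{ "execute": "migrate", "arguments": { "uri": "fd:savefd" } }
{ "return": {} }
```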
The requirements for now, in my view, seem to be:

* avoiding kernel file page thrashing for large transfers, which in my
  view currently requires changing QEMU to be able to migrate a stream
  to an fd that is open with O_DIRECT. In practice this means somehow
  making all QEMU migration stream writes block-friendly (adding some
  buffering?).

* allowing concurrent parallel transfers, to be able to use extra CPU
  resources to speed up the transfer if such resources are available.

* we should be able to transfer multiple GB/s with modern NVMes for
  super fast VM state save and restore (a few seconds even for a 30GB
  VM), and we should do no worse than the prototype fully implemented
  in libvirt, otherwise it would not make sense to implement it in
  QEMU.

What do you think?

Ciao,

Claudio

> [...]
On 220722 2210, Claudio Fontana wrote:
> Hi Richard,
>
> On 7/22/22 21:20, Richard Liu wrote:
> > [...]
>
> As I understand it, saving and restoring VM state in QEMU is usually
> best managed through the libvirt APIs, for example using the libvirt
> command-line tool virsh:
>
> $ virsh save (or managedsave)
>
> $ virsh restore (or start)
>
> These commands start a QEMU migration, driven over the QMP protocol,
> to a file descriptor previously opened by libvirt to hold the state
> file.
>
> (getfd QMP command):
> https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-2811
>
> (migrate QMP command):
> https://qemu-project.gitlab.io/qemu/interop/qemu-qmp-ref.html#qapidoc-1947
>
> This is unfortunately currently very slow. Maybe you could help with
> thinking out, or implementing, a solution?
>
> I tried to push an approach that only involves libvirt, using the
> existing QEMU multifd migration to a socket:
>
> https://listman.redhat.com/archives/libvir-list/2022-June/232252.html
>
> Performance is very good compared with what is possible today, but it
> won't be upstreamable because it is not deemed optimal, and libvirt
> wants the code to be in QEMU.
> What about helping to think out what the QEMU-based solution could
> look like?
>
> The requirements for now, in my view, seem to be:
>
> * avoiding kernel file page thrashing for large transfers, which in
>   my view currently requires changing QEMU to be able to migrate a
>   stream to an fd that is open with O_DIRECT. In practice this means
>   somehow making all QEMU migration stream writes block-friendly
>   (adding some buffering?).
>
> * allowing concurrent parallel transfers, to be able to use extra CPU
>   resources to speed up the transfer if such resources are available.
>
> * we should be able to transfer multiple GB/s with modern NVMes for
>   super fast VM state save and restore (a few seconds even for a 30GB
>   VM), and we should do no worse than the prototype fully implemented
>   in libvirt, otherwise it would not make sense to implement it in
>   QEMU.
>
> What do you think?

Hi Claudio,

These changes aim to restore a VM hundreds to thousands of times per
second within the same process. Do you think that is achievable with
the design of qmp migrate? We want to avoid serializing/transferring
all of memory over the FD. So right now, this series only uses
migration code for device state.

Right now (in 3/3), the memory is "restored" simply by re-mmapping
MAP_PRIVATE from file-backed memory. However, future versions might use
dirty-page tracking with a shadow memory snapshot, to avoid the page
faults that result from the mmap + MAP_PRIVATE approach.

In terms of the way the guest initiates snapshots/restores, maybe there
is a neater way to do this with QMP, by providing the guest with access
to QMP via a serial device. That way, we avoid the need for a custom
virtual device. Right now, the snapshots are requested/restored over
MMIO, since we need to make snapshots at precise locations in the
guest's execution (i.e. at a specific program counter in a process
running in the guest). I wonder if there is a way to achieve that with
QMP forwarded to the guest.
-Alex

> [...]