This series is essentially V1 of a prior RFC [1] to support QEMU's
mapped-ram stream format [2] and migration capability. Along with
supporting mapped-ram, it implements a design approach we discussed
for supporting parallel save/restore [3]. In summary, the approach is
1. Add mapped-ram migration capability
2. Steal an element from save header 'unused' for a 'features' variable
and bump save version to 3.
3. Add /etc/libvirt/qemu.conf knob for the save format version,
defaulting to latest v3
4. Use v3 (aka mapped-ram) by default
5. Use mapped-ram with BYPASS_CACHE for v3, old approach for v2
6. include: Define constants for parallel save/restore
7. qemu: Add support for parallel save. Implies mapped-ram, reject if v2
8. qemu: Add support for parallel restore. Implies mapped-ram.
Reject if v2
9. tools: add parallel parameter to virsh save command
10. tools: add parallel parameter to virsh restore command
With this series, saving and restoring using mapped-ram is enabled by
default if the underlying QEMU advertises the mapped-ram migration
capability. It can be disabled by changing the 'save_image_version'
setting in qemu.conf.
To use mapped-ram with QEMU:
- The 'mapped-ram' migration capability must be set to true
- The 'multifd' migration capability must be set to true and
the 'multifd-channels' migration parameter must set to a
value >= 1
- QEMU must be provided an fdset containing the migration fd(s)
- The 'migrate' qmp command is invoked with a URI referencing the fdset
and an offset where to start reading or writing the data stream, e.g.
{"execute":"migrate",
"arguments":{"detach":true,"resume":false,
"uri":"file:/dev/fdset/0,offset=0x11921"}}
The mapped-ram stream, in conjunction with direct IO and multifd, can
significantly improve the time required to save VM memory state. The
following tables compare mapped-ram with the existing, sequential save
stream. In all cases, the save and restore operations are to/from a
block device comprised of two NVMe disks in RAID0 configuration with
xfs (~8600MiB/s). The values in the 'save time' and 'restore time'
columns were scraped from the 'real' time reported by time(1). The
'Size' and 'Blocks' columns were provided by the corresponding
outputs of stat(1).
VM: 32G RAM, 1 vcpu, idle (shortly after boot)
| save | restore |
| time | time | Size | Blocks
-----------------------+---------+---------+--------------+--------
legacy | 6.193s | 4.399s | 985744812 | 1925288
-----------------------+---------+---------+--------------+--------
mapped-ram | 5.109s | 1.176s | 34368554354 | 1774472
-----------------------+---------+---------+--------------+--------
legacy + direct IO | 5.725s | 4.512s | 985765251 | 1925328
-----------------------+---------+---------+--------------+--------
mapped-ram + direct IO | 4.627s | 1.490s | 34368554354 | 1774304
-----------------------+---------+---------+--------------+--------
mapped-ram + direct IO | | | |
+ multifd-channels=8 | 4.421s | 0.845s | 34368554318 | 1774312
-------------------------------------------------------------------
VM: 32G RAM, 30G dirty, 1 vcpu in tight loop dirtying memory
| save | restore |
| time | time | Size | Blocks
-----------------------+---------+---------+--------------+---------
legacy | 25.800s | 14.332s | 33154309983 | 64754512
-----------------------+---------+---------+--------------+---------
mapped-ram | 18.742s | 15.027s | 34368559228 | 64617160
-----------------------+---------+---------+--------------+---------
legacy + direct IO | 13.115s | 18.050s | 33154310496 | 64754520
-----------------------+---------+---------+--------------+---------
mapped-ram + direct IO | 13.623s | 15.959s | 34368557392 | 64662040
-----------------------+-------- +---------+--------------+---------
mapped-ram + direct IO | | | |
+ multifd-channels=8 | 6.994s | 6.470s | 34368554980 | 64665776
--------------------------------------------------------------------
As can be seen from the tables, one caveat of mapped-ram is the logical file
size of a saved image is basically equivalent to the VM memory size. Note
however that mapped-ram typically uses fewer blocks on disk.
Support for mapped-ram+direct-io only recently landed in upstream QEMU
and will first appear in the 9.1 release, which may complicate merging
support in libvirt. Specifically, I'm not sure how to detect if the
combination is supported by QEMU. Suggestions welcomed.
Similar to the RFC, V1 ignores compression. libvirt currently supports
compression by connecting the output of QEMU's save stream to the specified
compression program via a pipe. This approach is incompatible with mapped-ram
since the fd provided to QEMU must be seekable. In general, we can consider
mapped-ram and compression incompatible and document they cannot be used
together.
[1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/EF6YS5YIPYF2JXFMSKP6OLEJ2XWXJ3XW/
[2] https://gitlab.com/qemu-project/qemu/-/blob/master/docs/devel/migration/mapped-ram.rst?ref_type=heads
[3] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/K4BDDJDMJ22XMJEFAUE323H5S5E47VQX/
Claudio Fontana (2):
include: Define constants for parallel save/restore
tools: add parallel parameter to virsh restore command
Jim Fehlig (17):
lib: virDomainSaveParams: Ensure absolute save path
qemu_fd: Add function to retrieve fdset ID
qemu: Add function to check capability in migration params
qemu: Add function to get bool value from migration params
qemu: Add mapped-ram migration capability
qemu: Add function to get migration params for save
qemu: QEMU_SAVE_VERSION: Bump to version 3
qemu: conf: Add setting for save image version
qemu: Add helper function for creating save image fd
qemu: Add support for mapped-ram on save
qemu: Decompose qemuSaveImageOpen
qemu: Move creation of qemuProcessIncomingDef struct
qemu: Apply migration parameters in qemuMigrationDstRun
qemu: Add support for mapped-ram on restore
qemu: Support O_DIRECT with mapped-ram on save
qemu: Support O_DIRECT with mapped-ram on restore
qemu: Add support for parallel save and restore
Li Zhang (1):
tools: add parallel parameter to virsh save command
docs/manpages/virsh.rst | 9 +-
include/libvirt/libvirt-domain.h | 13 ++
src/libvirt-domain.c | 52 +++++--
src/qemu/libvirtd_qemu.aug | 1 +
src/qemu/qemu.conf.in | 6 +
src/qemu/qemu_conf.c | 16 +++
src/qemu/qemu_conf.h | 5 +
src/qemu/qemu_driver.c | 104 +++++++++-----
src/qemu/qemu_fd.c | 18 +++
src/qemu/qemu_fd.h | 3 +
src/qemu/qemu_migration.c | 192 +++++++++++++++++--------
src/qemu/qemu_migration.h | 9 +-
src/qemu/qemu_migration_params.c | 86 ++++++++++++
src/qemu/qemu_migration_params.h | 17 +++
src/qemu/qemu_monitor.c | 39 ++++++
src/qemu/qemu_monitor.h | 5 +
src/qemu/qemu_process.c | 120 +++++++++++-----
src/qemu/qemu_process.h | 19 ++-
src/qemu/qemu_saveimage.c | 216 ++++++++++++++++++++---------
src/qemu/qemu_saveimage.h | 35 +++--
src/qemu/qemu_snapshot.c | 26 ++--
src/qemu/test_libvirtd_qemu.aug.in | 1 +
tools/virsh-domain.c | 79 +++++++++--
23 files changed, 827 insertions(+), 244 deletions(-)
--
2.35.3