Hi,
In this v4:
- Added support for 'fd:'. With fixed-ram, that comes for free via the
  existing routing to file.c. With multifd, I added a loop to create
  the channels.
- Dropped support for direct-io with fixed-ram _without_ multifd. This
  is something I said I would do for this version, but I had to drop
  it because performance is really bad. I think the single-threaded
  precopy code cannot cope with the extra latency/synchronicity of
  O_DIRECT.
- Dropped the QIOTask-related changes. The file migration now calls
  multifd_channel_connect() directly. Any error can now be returned
  all the way up to migrate_fd_connect(). We can also skip the
  channels_created semaphore logic when using fixed-ram.
- Moved the pwritev_read_contiguous code into a migration-specific
  file and dropped the write_base trick.
- Reduced the number of syncs to just one per RAM iteration and one
  at the end on the send side, and a single one at the end on the recv
  side. The EOS flag cannot be skipped because it is used for control
  flow in ram_load_precopy.

The rest are minor changes; I have noted them in the patches
themselves.
CI run: https://gitlab.com/farosas/qemu/-/pipelines/1183853433
Series structure
================
This series enables fixed-ram in steps:
0) Cleanups [1-5]
1) QIOChannel interfaces [6-10]
2) Fixed-ram format for precopy [11-15]
3) Multifd adaptation without packets [16-19]
4) Fixed-ram format for multifd [20-26]
5) Direct-io generic support [27]
6) Direct-io for fixed-ram multifd with file: URI [28-29]
7) Fdset interface for fixed-ram multifd [30-34]
The majority of changes for this version are at step 3 due to the
rebase on top of the recent multifd cleanups.
Please take a look at the later patches in the series, step 5 onwards.
About fixed-ram
===============
Fixed-ram is a new stream format for the RAM section designed to
supplement the existing ``file:`` migration and make it compatible
with ``multifd``. This enables parallel migration of a guest's RAM to
a file.
The core of the feature is to ensure that each RAM page has a specific
offset in the resulting migration file. This enables the ``multifd``
threads to write exclusively to those offsets even if the guest is
constantly dirtying pages (i.e. live migration).
Another benefit is that the resulting file will have a bounded size,
since pages which are dirtied multiple times will always go to a fixed
location in the file, rather than constantly being added to a
sequential stream.
Having the pages at fixed offsets also allows the use of O_DIRECT
for save/restore of the migration stream, since the pages are
guaranteed to be written at offsets that respect the O_DIRECT
alignment restrictions.
Latest numbers
==============
=> guest: 128 GB RAM - 120 GB dirty - 1 vcpu in tight loop dirtying memory
=> host: 128 CPU AMD EPYC 7543 - 2 NVMe disks in RAID0 (8586 MiB/s) - xfs
=> pinned vcpus w/ NUMA shortest distances - average of 3 runs - results
from query-migrate
non-live           | time (ms)   pages/s    mb/s   MB/s
-------------------+-----------------------------------
file               |    110512    256258    9549   1193
+ bg-snapshot      |    245660    119581    4303    537
-------------------+-----------------------------------
fixed-ram          |    157975    216877    6672    834
+ multifd 8 ch.    |     95922    292178   10982   1372
+ direct-io        |     23268   1936897   45330   5666
-------------------------------------------------------

live               | time (ms)   pages/s    mb/s   MB/s
-------------------+-----------------------------------
file               |         -         -       -      -  (file grew 4x the VM size)
+ bg-snapshot      |    357635    141747    2974    371
-------------------+-----------------------------------
fixed-ram          |         -         -       -      -  (no convergence in 5 min)
+ multifd 8 ch.    |    230812    497551   14900   1862
+ direct-io        |     27475   1788025   46736   5842
-------------------------------------------------------
Previous versions of this patchset showed performance closer to
disk saturation, but because of the query-migrate bug[1] it's hard to
be confident in those numbers. I can't rule out a performance
regression, but so far I haven't spotted anything that could have
caused one.
[1] https://lore.kernel.org/r/20240219194457.26923-1-farosas@suse.de
v3:
https://lore.kernel.org/r/20231127202612.23012-1-farosas@suse.de
v2:
https://lore.kernel.org/r/20231023203608.26370-1-farosas@suse.de
v1:
https://lore.kernel.org/r/20230330180336.2791-1-farosas@suse.de
Fabiano Rosas (31):
docs/devel/migration.rst: Document the file transport
tests/qtest/migration: Rename fd_proto test
tests/qtest/migration: Add a fd + file test
migration/multifd: Remove p->quit from recv side
migration/multifd: Release recv sem_sync earlier
io: fsync before closing a file channel
migration/qemu-file: add utility methods for working with seekable
channels
migration/ram: Introduce 'fixed-ram' migration capability
migration: Add fixed-ram URI compatibility check
migration/ram: Add outgoing 'fixed-ram' migration
migration/ram: Add incoming 'fixed-ram' migration
tests/qtest/migration: Add tests for fixed-ram file-based migration
migration/multifd: Rename MultiFDSend|RecvParams::data to
compress_data
migration/multifd: Decouple recv method from pages
migration/multifd: Allow multifd without packets
migration/multifd: Allow receiving pages without packets
migration/multifd: Add outgoing QIOChannelFile support
migration/multifd: Add incoming QIOChannelFile support
migration/multifd: Prepare multifd sync for fixed-ram migration
migration/multifd: Support outgoing fixed-ram stream format
migration/multifd: Support incoming fixed-ram stream format
migration/multifd: Add fixed-ram support to fd: URI
tests/qtest/migration: Add a multifd + fixed-ram migration test
migration: Add direct-io parameter
migration/multifd: Add direct-io support
tests/qtest/migration: Add tests for file migration with direct-io
monitor: Honor QMP request for fd removal immediately
monitor: Extract fdset fd flags comparison into a function
monitor: fdset: Match against O_DIRECT
migration: Add support for fdset with multifd + file
tests/qtest/migration: Add a test for fixed-ram with passing of fds
Nikolay Borisov (3):
io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file
io: Add generic pwritev/preadv interface
io: implement io_pwritev/preadv for QIOChannelFile
docs/devel/migration/features.rst | 1 +
docs/devel/migration/fixed-ram.rst | 137 +++++++++
docs/devel/migration/main.rst | 22 ++
include/exec/ramblock.h | 13 +
include/io/channel.h | 83 ++++++
include/migration/qemu-file-types.h | 2 +
include/qemu/bitops.h | 13 +
include/qemu/osdep.h | 2 +
io/channel-file.c | 69 +++++
io/channel.c | 58 ++++
migration/fd.c | 30 ++
migration/fd.h | 1 +
migration/file.c | 258 +++++++++++++++-
migration/file.h | 9 +
migration/migration-hmp-cmds.c | 11 +
migration/migration.c | 68 ++++-
migration/multifd-zlib.c | 26 +-
migration/multifd-zstd.c | 26 +-
migration/multifd.c | 436 +++++++++++++++++++++-------
migration/multifd.h | 27 +-
migration/options.c | 66 +++++
migration/options.h | 2 +
migration/qemu-file.c | 106 +++++++
migration/qemu-file.h | 6 +
migration/ram.c | 333 ++++++++++++++++++++-
migration/ram.h | 1 +
migration/savevm.c | 1 +
monitor/fds.c | 27 +-
qapi/migration.json | 24 +-
tests/qtest/migration-helpers.c | 42 +++
tests/qtest/migration-helpers.h | 1 +
tests/qtest/migration-test.c | 303 ++++++++++++++++++-
util/osdep.c | 9 +
33 files changed, 2041 insertions(+), 172 deletions(-)
create mode 100644 docs/devel/migration/fixed-ram.rst
--
2.35.3