From: Peter Xu <peterx@redhat.com>
Reorganize the page, moving things around, and add a few
headlines ("Postcopy internals", "Postcopy features") to cover sub-areas.
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240109064628.595453-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
docs/devel/migration/postcopy.rst | 159 ++++++++++++++++--------------
1 file changed, 84 insertions(+), 75 deletions(-)
diff --git a/docs/devel/migration/postcopy.rst b/docs/devel/migration/postcopy.rst
index d60eec06ab..6c51e96d79 100644
--- a/docs/devel/migration/postcopy.rst
+++ b/docs/devel/migration/postcopy.rst
@@ -1,6 +1,9 @@
+========
Postcopy
========
+.. contents::
+
'Postcopy' migration is a way to deal with migrations that refuse to converge
(or take too long to converge) its plus side is that there is an upper bound on
the amount of migration traffic and time it takes, the down side is that during
@@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
doesn't finish in a given time the switch is made to postcopy.
Enabling postcopy
------------------
+=================
To enable postcopy, issue this command on the monitor (both source and
destination) prior to the start of migration:
@@ -49,8 +52,71 @@ time per vCPU.
``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
the destination is waiting for).
-Postcopy device transfer
-------------------------
+Postcopy internals
+==================
+
+State machine
+-------------
+
+Postcopy moves through a series of states (see postcopy_state) from
+ADVISE->DISCARD->LISTEN->RUNNING->END
+
+ - Advise
+
+ Set at the start of migration if postcopy is enabled, even
+ if it hasn't had the start command; here the destination
+ checks that its OS has the support needed for postcopy, and performs
+ setup to ensure the RAM mappings are suitable for later postcopy.
+ The destination will fail early in migration at this point if the
+ required OS support is not present.
+ (Triggered by reception of POSTCOPY_ADVISE command)
+
+ - Discard
+
+ Entered on receipt of the first 'discard' command; prior to
+ the first Discard being performed, hugepages are switched off
+ (using madvise) to ensure that no new huge pages are created
+ during the postcopy phase, and to cause any huge pages that
+ have discards on them to be broken.
+
+ - Listen
+
+ The first command in the package, POSTCOPY_LISTEN, switches
+ the destination state to Listen, and starts a new thread
+ (the 'listen thread') which takes over the job of receiving
+ pages off the migration stream, while the main thread carries
+ on processing the blob. With this thread able to process page
+ reception, the destination now 'sensitises' the RAM to detect
+ any access to missing pages (on Linux using the 'userfault'
+ system).
+
+ - Running
+
+ POSTCOPY_RUN causes the destination to synchronise all
+ state and start the CPUs and IO devices running. The main
+ thread now finishes processing the migration package and
+ now carries on as it would for normal precopy migration
+ (although it can't do the cleanup it would do as it
+ finishes a normal migration).
+
+ - Paused
+
+ Postcopy can run into a paused state (normally on both sides when
+ happens), where all threads will be temporarily halted mostly due to
+ network errors. When reaching paused state, migration will make sure
+ the qemu binary on both sides maintain the data without corrupting
+ the VM. To continue the migration, the admin needs to fix the
+ migration channel using the QMP command 'migrate-recover' on the
+ destination node, then resume the migration using QMP command 'migrate'
+ again on source node, with resume=true flag set.
+
+ - End
+
+ The listen thread can now quit, and perform the cleanup of migration
+ state, the migration is now complete.
+
+Device transfer
+---------------
Loading of device data may cause the device emulation to access guest RAM
that may trigger faults that have to be resolved by the source, as such
@@ -130,7 +196,20 @@ processing.
is no longer used by migration, while the listen thread carries on servicing
page data until the end of migration.
-Postcopy Recovery
+Source side page bitmap
+-----------------------
+
+The 'migration bitmap' in postcopy is basically the same as in the precopy,
+where each of the bit to indicate that page is 'dirty' - i.e. needs
+sending. During the precopy phase this is updated as the CPU dirties
+pages, however during postcopy the CPUs are stopped and nothing should
+dirty anything any more. Instead, dirty bits are cleared when the relevant
+pages are sent during postcopy.
+
+Postcopy features
+=================
+
+Postcopy recovery
-----------------
Comparing to precopy, postcopy is special on error handlings. When any
@@ -166,76 +245,6 @@ configurations of the guest. For example, when with async page fault
enabled, logically the guest can proactively schedule out the threads
accessing missing pages.
-Postcopy states
----------------
-
-Postcopy moves through a series of states (see postcopy_state) from
-ADVISE->DISCARD->LISTEN->RUNNING->END
-
- - Advise
-
- Set at the start of migration if postcopy is enabled, even
- if it hasn't had the start command; here the destination
- checks that its OS has the support needed for postcopy, and performs
- setup to ensure the RAM mappings are suitable for later postcopy.
- The destination will fail early in migration at this point if the
- required OS support is not present.
- (Triggered by reception of POSTCOPY_ADVISE command)
-
- - Discard
-
- Entered on receipt of the first 'discard' command; prior to
- the first Discard being performed, hugepages are switched off
- (using madvise) to ensure that no new huge pages are created
- during the postcopy phase, and to cause any huge pages that
- have discards on them to be broken.
-
- - Listen
-
- The first command in the package, POSTCOPY_LISTEN, switches
- the destination state to Listen, and starts a new thread
- (the 'listen thread') which takes over the job of receiving
- pages off the migration stream, while the main thread carries
- on processing the blob. With this thread able to process page
- reception, the destination now 'sensitises' the RAM to detect
- any access to missing pages (on Linux using the 'userfault'
- system).
-
- - Running
-
- POSTCOPY_RUN causes the destination to synchronise all
- state and start the CPUs and IO devices running. The main
- thread now finishes processing the migration package and
- now carries on as it would for normal precopy migration
- (although it can't do the cleanup it would do as it
- finishes a normal migration).
-
- - Paused
-
- Postcopy can run into a paused state (normally on both sides when
- happens), where all threads will be temporarily halted mostly due to
- network errors. When reaching paused state, migration will make sure
- the qemu binary on both sides maintain the data without corrupting
- the VM. To continue the migration, the admin needs to fix the
- migration channel using the QMP command 'migrate-recover' on the
- destination node, then resume the migration using QMP command 'migrate'
- again on source node, with resume=true flag set.
-
- - End
-
- The listen thread can now quit, and perform the cleanup of migration
- state, the migration is now complete.
-
-Source side page map
---------------------
-
-The 'migration bitmap' in postcopy is basically the same as in the precopy,
-where each of the bit to indicate that page is 'dirty' - i.e. needs
-sending. During the precopy phase this is updated as the CPU dirties
-pages, however during postcopy the CPUs are stopped and nothing should
-dirty anything any more. Instead, dirty bits are cleared when the relevant
-pages are sent during postcopy.
-
Postcopy with hugepages
-----------------------
@@ -293,7 +302,7 @@ Retro-fitting postcopy to existing clients is possible:
guest memory access is made while holding a lock then all other
threads waiting for that lock will also be blocked.
-Postcopy Preemption Mode
+Postcopy preemption mode
------------------------
Postcopy preempt is a new capability introduced in 8.0 QEMU release, it
--
2.43.0