... | ... | ||
---|---|---|---|
6 | albeit with new virtual addresses in new QEMU, and by preserving device | 6 | albeit with new virtual addresses in new QEMU, and by preserving device |
7 | file descriptors. | 7 | file descriptors. |
8 | 8 | ||
9 | The new user-visible interfaces are: | 9 | The new user-visible interfaces are: |
10 | * cpr-transfer (MigMode migration parameter) | 10 | * cpr-transfer (MigMode migration parameter) |
11 | * cpr-uri (migration parameter) | 11 | * cpr (MigrationChannelType) |
12 | * cpr-uri (command-line argument) | 12 | * incoming MigrationChannel (command-line argument) |
13 | * aux-ram-share (machine option) | ||
13 | 14 | ||
14 | The user sets the mode parameter before invoking the migrate command. | 15 | The user sets the mode parameter before invoking the migrate command. |
15 | In this mode, the user starts new QEMU on the same host as old QEMU, with | 16 | In this mode, the user starts new QEMU on the same host as old QEMU, with |
16 | the same arguments as old QEMU, plus the -incoming and the -cpr-uri options. | 17 | the same arguments as old QEMU, plus two -incoming options; one for the main |
17 | The user issues the migrate command to old QEMU, which stops the VM, saves | 18 | channel, and one for the CPR channel. The user issues the migrate command to |
18 | state to the migration channels, and enters the postmigrate state. Execution | 19 | old QEMU, which stops the VM, saves state to the migration channels, and |
19 | resumes in new QEMU. | 20 | enters the postmigrate state. Execution resumes in new QEMU. |
20 | 21 | ||
21 | Memory-backend objects must have the share=on attribute, but memory-backend-epc | 22 | Memory-backend objects must have the share=on attribute, but memory-backend-epc |
22 | and memory-backend-ram are not supported. The VM must be started with the | 23 | is not supported. The VM must be started with the '-machine aux-ram-share=on' |
23 | '-machine anon-alloc=memfd' option, which allows anonymous memory to be | 24 | option, which allows auxiliary guest memory to be transferred in place to the |
24 | transferred in place to the new process. | 25 | new process. |
25 | 26 | ||
26 | This mode requires a second migration channel, specified by the cpr-uri | 27 | This mode requires a second migration channel of type "cpr", in the channel |
27 | migration property on the outgoing side, and by the cpr-uri QEMU command-line | 28 | arguments on the outgoing side, and in a second -incoming command-line |
28 | option on the incoming side. The channel must be a type, such as unix socket, | 29 | parameter on the incoming side. This CPR channel must support file descriptor |
29 | that supports SCM_RIGHTS. | 30 | transfer with SCM_RIGHTS, i.e. it must be a UNIX domain socket. |
30 | 31 | ||
31 | Why? | 32 | Why? |
32 | 33 | ||
33 | This mode has less impact on the guest than any other method of updating | 34 | This mode has less impact on the guest than any other method of updating |
34 | in place. The pause time is much lower, because devices need not be torn | 35 | in place. The pause time is much lower, because devices need not be torn |
... | ... | ||
55 | Anonymous memory must be allocated using memfd_create rather than MAP_ANON, | 56 | Anonymous memory must be allocated using memfd_create rather than MAP_ANON, |
56 | so the memfd's can be sent to new QEMU. Pages that were locked in memory | 57 | so the memfd's can be sent to new QEMU. Pages that were locked in memory |
57 | for DMA in old QEMU remain locked in new QEMU, because the descriptor of | 58 | for DMA in old QEMU remain locked in new QEMU, because the descriptor of |
58 | the device that locked them remains open. | 59 | the device that locked them remains open. |
59 | 60 | ||
60 | cpr-transfer preserves descriptors by sending them to new QEMU via the | 61 | cpr-transfer preserves descriptors by sending them to new QEMU via the CPR |
61 | cpr-uri, which must support SCM_RIGHTS, and by sending the unique name | 62 | channel, which must support SCM_RIGHTS, and by sending the unique name of |
62 | and value of each descriptor to new QEMU via CPR state. | 63 | each descriptor to new QEMU via CPR state. |
63 | 64 | ||
64 | For device descriptors, new QEMU reuses the descriptor when creating the | 65 | For device descriptors, new QEMU reuses the descriptor when creating the |
65 | device, rather than opening it again. For memfd descriptors, new QEMU | 66 | device, rather than opening it again. For memfd descriptors, new QEMU |
66 | mmap's the preserved memfd when a ramblock is created. | 67 | mmap's the preserved memfd when a ramblock is created. |
67 | 68 | ||
68 | CPR state cannot be sent over the normal migration channel, because devices | 69 | CPR state cannot be sent over the normal migration channel, because devices |
69 | and backends are created prior to reading the channel, so this mode sends | 70 | and backends are created prior to reading the channel, so this mode sends |
70 | CPR state over a second migration channel, specified by cpr-uri. New QEMU | 71 | CPR state over a second "cpr" migration channel. New QEMU reads the second |
71 | reads the second channel prior to creating devices or backends. | 72 | channel prior to creating devices or backends. |
72 | 73 | ||
73 | Example: | 74 | Example: |
74 | 75 | ||
75 | In this example, we simply restart the same version of QEMU, but in | 76 | In this example, we simply restart the same version of QEMU, but in |
76 | a real scenario one would use a new QEMU binary path in terminal 2. | 77 | a real scenario one would use a new QEMU binary path in terminal 2. |
77 | 78 | ||
78 | Terminal 1: start old QEMU | 79 | Terminal 1: start old QEMU |
79 | # qemu-kvm -monitor stdio -object | 80 | # qemu-kvm -qmp stdio -object |
80 | memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on | 81 | memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on |
81 | -m 4G -machine anon-alloc=memfd ... | 82 | -m 4G -machine aux-ram-share=on ... |
82 | 83 | ||
83 | Terminal 2: start new QEMU | 84 | Terminal 2: start new QEMU |
84 | # qemu-kvm ... -incoming unix:vm.sock -cpr-uri unix:cpr.sock | 85 | # qemu-kvm -monitor stdio ... -incoming tcp:0:44444 |
86 | -incoming '{"channel-type": "cpr", | ||
87 | "addr": { "transport": "socket", "type": "unix", | ||
88 | "path": "cpr.sock"}}' | ||
85 | 89 | ||
86 | Terminal 1: | 90 | Terminal 1: |
87 | QEMU 9.1.50 monitor - type 'help' for more information | 91 | {"execute":"qmp_capabilities"} |
88 | (qemu) info status | 92 | |
89 | VM status: running | 93 | {"execute": "query-status"} |
90 | (qemu) migrate_set_parameter mode cpr-transfer | 94 | {"return": {"status": "running", |
91 | (qemu) migrate_set_parameter cpr-uri unix:cpr.sock | 95 | "running": true}} |
92 | (qemu) migrate -d unix:vm.sock | 96 | |
93 | (qemu) info status | 97 | {"execute":"migrate-set-parameters", |
94 | VM status: paused (postmigrate) | 98 | "arguments":{"mode":"cpr-transfer"}} |
99 | |||
100 | {"execute": "migrate", "arguments": { "channels": [ | ||
101 | {"channel-type": "main", | ||
102 | "addr": { "transport": "socket", "type": "inet", | ||
103 | "host": "0", "port": "44444" }}, | ||
104 | {"channel-type": "cpr", | ||
105 | "addr": { "transport": "socket", "type": "unix", | ||
106 | "path": "cpr.sock" }}]}} | ||
107 | |||
108 | {"execute": "query-status"} | ||
109 | {"return": {"status": "postmigrate", | ||
110 | "running": false}} | ||
95 | 111 | ||
96 | Terminal 2: | 112 | Terminal 2: |
97 | QEMU 9.1.50 monitor - type 'help' for more information | 113 | QEMU 10.0.50 monitor - type 'help' for more information |
98 | (qemu) info status | 114 | (qemu) info status |
99 | VM status: running | 115 | VM status: running |
100 | 116 | ||
101 | This patch series implements a minimal version of cpr-transfer. Additional | 117 | This patch series implements a minimal version of cpr-transfer. Additional |
102 | series are ready to be posted to deliver the complete vision described | 118 | series are ready to be posted to deliver the complete vision described |
... | ... | ||
116 | * addressed misc review comments | 132 | * addressed misc review comments |
117 | 133 | ||
118 | Changes in V3: | 134 | Changes in V3: |
119 | * added cpr-transfer to migration-test | 135 | * added cpr-transfer to migration-test |
120 | * documented cpr-transfer in CPR.rst | 136 | * documented cpr-transfer in CPR.rst |
121 | * fixed size_t trace format for 32-bit build | 137 | * fix size_t trace format for 32-bit build |
122 | * dropped explicit fd value in VMSTATE_FD | 138 | * drop explicit fd value in VMSTATE_FD |
123 | * deferred cpr_walk_fd() and cpr_resave_fd() to later series | 139 | * defer cpr_walk_fd() and cpr_resave_fd() to later series |
124 | * dropped "migration: save cpr mode". | 140 | * drop "migration: save cpr mode". |
125 | deleted mode from cpr state, and used cpr_uri to infer transfer mode. | 141 | delete mode from cpr state, and use cpr_uri to infer transfer mode. |
126 | * dropped "migration: stop vm earlier for cpr" | 142 | * drop "migration: stop vm earlier for cpr" |
143 | * dropped cpr helpers, to be re-added later when needed | ||
127 | * fixed an unreported bug for cpr-transfer and migrate cancel | 144 | * fixed an unreported bug for cpr-transfer and migrate cancel |
128 | * documented cpr-transfer restrictions in qapi | 145 | * documented cpr-transfer restrictions in qapi |
129 | * added trace for cpr_state_save and cpr_state_load | 146 | * added trace for cpr_state_save and cpr_state_load |
130 | * added ftruncate to "preserve ram blocks" | 147 | * added ftruncate to "preserve ram blocks" |
131 | 148 | ||
132 | The first 4 patches below are foundational and are needed for both cpr-transfer | 149 | Changes in V4: |
133 | mode and the proposed cpr-exec mode. The next 7 patches are specific to | 150 | * cleaned up qtest deferred connection code |
151 | * renamed pass_fd -> can_pass_fd | ||
152 | * squashed patch "split qmp_migrate" | ||
153 | * deleted cpr-uri and its patches | ||
154 | * added cpr channel and its patches | ||
155 | * added patch "hostmem-shm: preserve for cpr" | ||
156 | * added patch "fd-based shared memory" | ||
157 | * added patch "factor out allocation of anonymous shared memory" | ||
158 | * added RAM_PRIVATE and its patch | ||
159 | * added aux-ram-share and its patch | ||
160 | |||
161 | Changes in V5: | ||
162 | * added patch 'enhance migrate_uri_parse' | ||
163 | * supported dotted keys for -incoming channel, | ||
164 | and rewrote incoming_option_parse | ||
165 | * moved migrate_fd_cancel -> vm_resume to "stop vm earlier for cpr" | ||
166 | in a future series. | ||
167 | * updated command-line definition for aux-ram-share | ||
168 | * added patch "resizable qemu_ram_alloc_from_fd" | ||
169 | * rewrote patch "fd-based shared memory" | ||
170 | * fixed error message in qemu_shm_alloc | ||
171 | * added patch 'tests/qtest: optimize migrate_set_ports' | ||
172 | * added patch 'tests/qtest: enhance migration channels' | ||
173 | * added patch 'tests/qtest: assert qmp_ready' | ||
174 | * modified patch 'migration-test: cpr-transfer' | ||
175 | * polished the documentation in CPR.rst, qapi, and the | ||
176 | cpr-transfer mode commit message | ||
177 | * updated to master, and resolved massive context diffs for migration tests | ||
178 | |||
179 | Changes in V6: | ||
180 | * added RB's and Acks. | ||
181 | * in patch "assert qmp_ready", deleted qmp_ready and checked qmp_fd instead. | ||
182 | renamed patch to "assert qmp connected" | ||
183 | * factored out fix into new patch | ||
184 | "fix qemu_ram_alloc_from_fd size calculation" | ||
185 | * deleted a redundant call to migrate_hup_delete | ||
186 | * added commit message to "migration: cpr-transfer documentation" | ||
187 | * polished the text of cpr-transfer mode in qapi | ||
188 | |||
189 | Changes in V7: | ||
190 | * fixed cpr-transfer test failure for s390 | ||
191 | * fixed machine_get_aux_ram_share compilation error for Windows | ||
192 | * fixed size_t print format compilation error for misc architectures | ||
193 | * fixed memory leaks in cpr_transfer_output, cpr_transfer_input, and | ||
194 | qemu_file_get_fd | ||
195 | |||
196 | The first 10 patches below are foundational and are needed for both cpr-transfer | ||
197 | mode and the proposed cpr-exec mode. The next 6 patches are specific to | ||
134 | cpr-transfer and implement the mechanisms for sharing state across a socket | 198 | cpr-transfer and implement the mechanisms for sharing state across a socket |
135 | using SCM_RIGHTS. The last 5 patches supply tests and documentation. | 199 | using SCM_RIGHTS. The last 8 patches supply tests and documentation. |
136 | 200 | ||
137 | Steve Sistare (16): | 201 | Steve Sistare (24): |
138 | machine: anon-alloc option | 202 | backends/hostmem-shm: factor out allocation of "anonymous shared |
203 | memory with an fd" | ||
204 | physmem: fix qemu_ram_alloc_from_fd size calculation | ||
205 | physmem: qemu_ram_alloc_from_fd extensions | ||
206 | physmem: fd-based shared memory | ||
207 | memory: add RAM_PRIVATE | ||
208 | machine: aux-ram-share option | ||
139 | migration: cpr-state | 209 | migration: cpr-state |
140 | physmem: preserve ram blocks for cpr | 210 | physmem: preserve ram blocks for cpr |
141 | hostmem-memfd: preserve for cpr | 211 | hostmem-memfd: preserve for cpr |
212 | hostmem-shm: preserve for cpr | ||
213 | migration: enhance migrate_uri_parse | ||
214 | migration: incoming channel | ||
142 | migration: SCM_RIGHTS for QEMUFile | 215 | migration: SCM_RIGHTS for QEMUFile |
143 | migration: VMSTATE_FD | 216 | migration: VMSTATE_FD |
144 | migration: cpr-transfer save and load | 217 | migration: cpr-transfer save and load |
145 | migration: cpr-uri parameter | ||
146 | migration: cpr-uri option | ||
147 | migration: split qmp_migrate | ||
148 | migration: cpr-transfer mode | 218 | migration: cpr-transfer mode |
149 | tests/migration-test: memory_backend | 219 | migration-test: memory_backend |
220 | tests/qtest: optimize migrate_set_ports | ||
150 | tests/qtest: defer connection | 221 | tests/qtest: defer connection |
151 | tests/migration-test: defer connection | 222 | migration-test: defer connection |
223 | tests/qtest: enhance migration channels | ||
224 | tests/qtest: assert qmp connected | ||
152 | migration-test: cpr-transfer | 225 | migration-test: cpr-transfer |
153 | migration: cpr-transfer documentation | 226 | migration: cpr-transfer documentation |
154 | 227 | ||
155 | backends/hostmem-memfd.c | 12 ++- | 228 | backends/hostmem-epc.c | 2 +- |
156 | docs/devel/migration/CPR.rst | 144 +++++++++++++++++++++++++- | 229 | backends/hostmem-file.c | 2 +- |
157 | hw/core/machine.c | 19 ++++ | 230 | backends/hostmem-memfd.c | 14 ++- |
158 | include/hw/boards.h | 1 + | 231 | backends/hostmem-ram.c | 2 +- |
159 | include/migration/cpr.h | 29 ++++++ | 232 | backends/hostmem-shm.c | 51 ++------ |
160 | include/migration/vmstate.h | 9 ++ | 233 | docs/devel/migration/CPR.rst | 182 ++++++++++++++++++++++++++- |
161 | migration/cpr-transfer.c | 81 +++++++++++++++ | 234 | hw/core/machine.c | 22 ++++ |
162 | migration/cpr.c | 223 +++++++++++++++++++++++++++++++++++++++++ | 235 | include/exec/memory.h | 10 ++ |
163 | migration/meson.build | 2 + | 236 | include/exec/ram_addr.h | 13 +- |
164 | migration/migration-hmp-cmds.c | 10 ++ | 237 | include/hw/boards.h | 1 + |
165 | migration/migration.c | 107 +++++++++++++++++++- | 238 | include/migration/cpr.h | 33 +++++ |
166 | migration/migration.h | 2 + | 239 | include/migration/misc.h | 7 ++ |
167 | migration/options.c | 40 +++++++- | 240 | include/migration/vmstate.h | 9 ++ |
168 | migration/options.h | 1 + | 241 | include/qemu/osdep.h | 1 + |
169 | migration/qemu-file.c | 83 ++++++++++++++- | 242 | meson.build | 8 +- |
170 | migration/qemu-file.h | 2 + | 243 | migration/cpr-transfer.c | 71 +++++++++++ |
171 | migration/ram.c | 2 + | 244 | migration/cpr.c | 224 +++++++++++++++++++++++++++++++++ |
172 | migration/trace-events | 9 ++ | 245 | migration/meson.build | 2 + |
173 | migration/vmstate-types.c | 24 +++++ | 246 | migration/migration.c | 139 +++++++++++++++++++- |
174 | qapi/machine.json | 14 +++ | 247 | migration/migration.h | 4 +- |
175 | qapi/migration.json | 53 +++++++++- | 248 | migration/options.c | 8 +- |
176 | qemu-options.hx | 19 ++++ | 249 | migration/qemu-file.c | 84 ++++++++++++- |
177 | stubs/vmstate.c | 7 ++ | 250 | migration/qemu-file.h | 2 + |
178 | system/physmem.c | 63 ++++++++++++ | 251 | migration/ram.c | 2 + |
179 | system/trace-events | 3 + | 252 | migration/trace-events | 11 ++ |
180 | system/vl.c | 10 ++ | 253 | migration/vmstate-types.c | 24 ++++ |
181 | tests/qtest/libqtest.c | 69 ++++++++----- | 254 | qapi/migration.json | 44 ++++++- |
182 | tests/qtest/libqtest.h | 19 +++- | 255 | qemu-options.hx | 34 +++++ |
183 | tests/qtest/migration-test.c | 107 ++++++++++++++++++-- | 256 | stubs/vmstate.c | 7 ++ |
184 | 29 files changed, 1115 insertions(+), 49 deletions(-) | 257 | system/memory.c | 4 +- |
258 | system/physmem.c | 150 ++++++++++++++++++---- | ||
259 | system/trace-events | 1 + | ||
260 | system/vl.c | 43 ++++++- | ||
261 | tests/qtest/libqtest.c | 86 ++++++++----- | ||
262 | tests/qtest/libqtest.h | 19 ++- | ||
263 | tests/qtest/migration/cpr-tests.c | 62 +++++++++ | ||
264 | tests/qtest/migration/framework.c | 74 +++++++++-- | ||
265 | tests/qtest/migration/framework.h | 11 ++ | ||
266 | tests/qtest/migration/migration-qmp.c | 53 ++++++-- | ||
267 | tests/qtest/migration/migration-qmp.h | 10 +- | ||
268 | tests/qtest/migration/migration-util.c | 23 ++-- | ||
269 | tests/qtest/migration/misc-tests.c | 9 +- | ||
270 | tests/qtest/migration/precopy-tests.c | 6 +- | ||
271 | tests/qtest/virtio-net-failover.c | 8 +- | ||
272 | util/memfd.c | 16 ++- | ||
273 | util/oslib-posix.c | 52 ++++++++ | ||
274 | util/oslib-win32.c | 6 + | ||
275 | 47 files changed, 1472 insertions(+), 174 deletions(-) | ||
185 | create mode 100644 include/migration/cpr.h | 276 | create mode 100644 include/migration/cpr.h |
186 | create mode 100644 migration/cpr-transfer.c | 277 | create mode 100644 migration/cpr-transfer.c |
187 | create mode 100644 migration/cpr.c | 278 | create mode 100644 migration/cpr.c |
188 | 279 | ||
280 | base-commit: e8aa7fdcddfc8589bdc7c973a052e76e8f999455 | ||
281 | |||
189 | -- | 282 | -- |
190 | 1.8.3.1 | 283 | 1.8.3.1 |