From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

This is an updated v6 patch series of the v5 series located here:
https://lore.kernel.org/qemu-devel/cover.1739994627.git.maciej.szmigiero@oracle.com/

What is this patch set about?
Current live migration device state transfer is done via the main (single)
migration channel, which reduces performance and severely impacts the
migration downtime for VMs having large device state that needs to be
transferred during the switchover phase.

Example devices that have such large switchover phase device state are some
types of VFIO SmartNICs and GPUs.

This patch set allows parallelizing this transfer by using multifd channels
for it.
It also introduces new load and save threads per VFIO device for decoupling
these operations from the main migration thread.
These threads run on newly introduced generic (non-AIO) thread pools,
instantiated by the migration core.

Changes from v5:
* Add bql_locked() assertion to migration_incoming_state_destroy() with a
  comment describing why holding BQL there is necessary.

* Add SPDX-License-Identifier to newly added files.

* Move consistency check of multifd transfer settings to the patch adding
  the x-migration-multifd-transfer property.

* Change packet->idx == UINT32_MAX message to the suggested one.

* Use WITH_QEMU_LOCK_GUARD() in vfio_load_state_buffer().

* Add vfio_load_bufs_thread_{start,end} trace events.

* Invert "ret" value computation logic in vfio_load_bufs_thread() and
  vfio_multifd_save_complete_precopy_thread() - initialize "ret" to false
  at definition, remove "ret = false" at every failure/early exit block and
  add "ret = true" just before the early exit jump label.

* Make vfio_load_bufs_thread_load_config() return a bool and take an
  "Error **" parameter.

* Make vfio_multifd_setup() (previously called vfio_multifd_transfer_setup())
  allocate struct VFIOMultifd if requested by "alloc_multifd" parameter.

* Add vfio_multifd_cleanup() call to vfio_save_cleanup() (for consistency
  with the load code), with a comment describing that it is currently a NOP
  there.

* Move vfio_multifd_cleanup() to migration-multifd.c.

* Move general multifd migration description in docs/devel/migration/vfio.rst
  from the top section to new "Multifd" section at the bottom.

* Add comment describing why x-migration-multifd-transfer needs to be
  a custom property above the variable containing that custom property type
  in register_vfio_pci_dev_type().

* Add object_class_property_set_description() description for all 3 newly
  added parameters: x-migration-multifd-transfer,
  x-migration-load-config-after-iter and x-migration-max-queued-buffers.

* Split out wiring vfio_multifd_setup() and vfio_multifd_cleanup() into
  general VFIO load/save setup and cleanup methods into a brand new
  patch/commit.

* Squash the patch introducing VFIOStateBuffer(s) into the "received buffers
  queuing" commit to fix building the interim code form at the time of this
  patch with "-Werror".

* Change device state packet "idstr" field to NULL-terminated and drop
  QEMU_NONSTRING marking from its definition.

* Add vbasedev->name to VFIO error messages to know which device caused
  that error.

* Move BQL lock ordering assert closer to the other lock in the lock order
  in vfio_load_state_buffer().

* Drop orphan "QemuThread load_bufs_thread" VFIOMultifd member left over
  from the days of version 2 of this patch set.

* Change "guint" into "unsigned int" where it was present in this
  patch set.

* Use g_autoptr() for QEMUFile also in vfio_load_bufs_thread_load_config().

* Call multifd_abort_device_state_save_threads() if a migration error is
  already set in the save path to avoid needlessly waiting for the remaining
  threads to do all of their normal work.

* Other minor changes that should not have functional impact, like:
  renamed functions/labels, moved code lines between patches contained
  in this patch set, added review tags, code formatting, rebased on top
  of the latest QEMU git master, etc.

========================================================================

This patch set is targeting QEMU 10.0.

It is also exported as a git tree:
https://gitlab.com/maciejsszmigiero/qemu/-/commits/multifd-device-state-transfer-vfio

========================================================================

Maciej S. Szmigiero (35):
  migration: Clarify that {load,save}_cleanup handlers can run without
...
  vfio/migration: Convert bytes_transferred counter to atomic
  vfio/migration: Add vfio_add_bytes_transferred()
  vfio/migration: Move migration channel flags to vfio-common.h header
    file
  vfio/migration: Multifd device state transfer support - basic types
  vfio/migration: Multifd device state transfer - add support checking
    function
  vfio/migration: Multifd setup/cleanup functions and associated
    VFIOMultifd
  vfio/migration: Setup and cleanup multifd transfer in these general
    methods
  vfio/migration: Multifd device state transfer support - received
    buffers queuing
  vfio/migration: Multifd device state transfer support - load thread
  migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile
  vfio/migration: Multifd device state transfer support - config loading
    support
  vfio/migration: Multifd device state transfer support - send side
  vfio/migration: Add x-migration-multifd-transfer VFIO property
  vfio/migration: Make x-migration-multifd-transfer VFIO property
    mutable
  hw/core/machine: Add compat for x-migration-multifd-transfer VFIO
...
  vfio/migration: Update VFIO migration documentation

Peter Xu (1):
  migration/multifd: Make MultiFDSendData a struct

 docs/devel/migration/vfio.rst      |  79 ++-
 hw/core/machine.c                  |   2 +
 hw/vfio/meson.build                |   1 +
 hw/vfio/migration-multifd.c        | 786 +++++++++++++++++++++++++++++
 hw/vfio/migration-multifd.h        |  37 ++
 hw/vfio/migration.c                | 111 ++--
 hw/vfio/pci.c                      |  40 ++
 hw/vfio/trace-events               |  13 +-
 include/block/aio.h                |   8 +-
 include/block/thread-pool.h        |  62 ++-
 include/hw/vfio/vfio-common.h      |  34 ++
 include/migration/client-options.h |   4 +
 include/migration/misc.h           |  25 +
 include/migration/register.h       |  52 +-
 include/qapi/error.h               |   2 +
 include/qemu/typedefs.h            |   5 +
 migration/colo.c                   |   3 +
 migration/meson.build              |   1 +
 migration/migration-hmp-cmds.c     |   2 +
 migration/migration.c              |  20 +-
 migration/migration.h              |   7 +
 migration/multifd-device-state.c   | 212 ++++++++
 migration/multifd-nocomp.c         |  30 +-
 migration/multifd.c                | 248 +++++++--
 migration/multifd.h                |  74 ++-
 migration/options.c                |   9 +
 migration/qemu-file.h              |   2 +
 migration/savevm.c                 | 201 +++++++-
 migration/savevm.h                 |   6 +-
 migration/trace-events             |   1 +
 scripts/analyze-migration.py       |  11 +
 tests/unit/test-thread-pool.c      |   6 +-
 util/async.c                       |   6 +-
 util/thread-pool.c                 | 184 +++++--
 util/trace-events                  |   6 +-
 35 files changed, 2125 insertions(+), 165 deletions(-)
 create mode 100644 hw/vfio/migration-multifd.c
 create mode 100644 hw/vfio/migration-multifd.h
 create mode 100644 migration/multifd-device-state.c
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

There's already a max in-flight VFIO device state buffers *count* limit;
also add a max queued buffers *size* limit.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 docs/devel/migration/vfio.rst |  8 +++++---
 hw/vfio/migration-multifd.c   | 21 +++++++++++++++++++--
 hw/vfio/pci.c                 |  9 +++++++++
 include/hw/vfio/vfio-common.h |  1 +
 4 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -XXX,XX +XXX,XX @@ This means that a malicious QEMU source could theoretically cause the target
 QEMU to allocate unlimited amounts of memory for such buffers-in-flight.
 
 The "x-migration-max-queued-buffers" property allows capping the maximum count
-of these VFIO device state buffers queued at the destination.
+of these VFIO device state buffers queued at the destination while
+"x-migration-max-queued-buffers-size" property allows capping their total queued
+size.
 
 Because a malicious QEMU source causing OOM on the target is not expected to be
 a realistic threat in most of VFIO live migration use cases and the right value
-depends on the particular setup by default this queued buffers limit is
-disabled by setting it to UINT64_MAX.
+depends on the particular setup by default these queued buffers limits are
+disabled by setting them to UINT64_MAX.
 
 Some host platforms (like ARM64) require that VFIO device config is loaded only
 after all iterables were loaded.
diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -XXX,XX +XXX,XX @@ typedef struct VFIOMultifd {
     uint32_t load_buf_idx;
     uint32_t load_buf_idx_last;
     uint32_t load_buf_queued_pending_buffers;
+    size_t load_buf_queued_pending_buffers_size;
 } VFIOMultifd;
 
 static void vfio_state_buffer_clear(gpointer data)
@@ -XXX,XX +XXX,XX @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
     VFIOMigration *migration = vbasedev->migration;
     VFIOMultifd *multifd = migration->multifd;
     VFIOStateBuffer *lb;
+    size_t data_size = packet_total_size - sizeof(*packet);
 
     vfio_state_buffers_assert_init(&multifd->load_bufs);
     if (packet->idx >= vfio_state_buffers_size_get(&multifd->load_bufs)) {
@@ -XXX,XX +XXX,XX @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
         return false;
     }
 
-    lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
-    lb->len = packet_total_size - sizeof(*packet);
+    multifd->load_buf_queued_pending_buffers_size += data_size;
+    if (multifd->load_buf_queued_pending_buffers_size >
+        vbasedev->migration_max_queued_buffers_size) {
+        error_setg(errp,
+                   "%s: queuing state buffer %" PRIu32
+                   " would exceed the size max of %" PRIu64,
+                   vbasedev->name, packet->idx,
+                   vbasedev->migration_max_queued_buffers_size);
+        return false;
+    }
+
+    lb->data = g_memdup2(&packet->data, data_size);
+    lb->len = data_size;
     lb->is_present = true;
 
     return true;
@@ -XXX,XX +XXX,XX @@ static bool vfio_load_state_buffer_write(VFIODevice *vbasedev,
         assert(wr_ret <= buf_len);
         buf_len -= wr_ret;
         buf_cur += wr_ret;
+
+        assert(multifd->load_buf_queued_pending_buffers_size >= wr_ret);
+        multifd->load_buf_queued_pending_buffers_size -= wr_ret;
     }
 
     trace_vfio_load_state_device_buffer_load_end(vbasedev->name,
@@ -XXX,XX +XXX,XX @@ static VFIOMultifd *vfio_multifd_new(void)
     multifd->load_buf_idx = 0;
     multifd->load_buf_idx_last = UINT32_MAX;
     multifd->load_buf_queued_pending_buffers = 0;
+    multifd->load_buf_queued_pending_buffers_size = 0;
     qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
 
     multifd->load_bufs_iter_done = false;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = {
                             ON_OFF_AUTO_AUTO),
     DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
                        vbasedev.migration_max_queued_buffers, UINT64_MAX),
+    DEFINE_PROP_SIZE("x-migration-max-queued-buffers-size", VFIOPCIDevice,
+                     vbasedev.migration_max_queued_buffers_size, UINT64_MAX),
     DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
                      vbasedev.migration_events, false),
     DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
@@ -XXX,XX +XXX,XX @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
                                           "destination when doing live "
                                           "migration of device state via "
                                           "multifd channels");
+    object_class_property_set_description(klass, /* 10.0 */
+                                          "x-migration-max-queued-buffers-size",
+                                          "Maximum size of in-flight VFIO "
+                                          "device state buffers queued at the "
+                                          "destination when doing live "
+                                          "migration of device state via "
+                                          "multifd channels");
 }
 
 static const TypeInfo vfio_pci_dev_info = {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -XXX,XX +XXX,XX @@ typedef struct VFIODevice {
     OnOffAuto migration_multifd_transfer;
     OnOffAuto migration_load_config_after_iter;
     uint64_t migration_max_queued_buffers;
+    uint64_t migration_max_queued_buffers_size;
     bool migration_events;
     VFIODeviceOps *ops;
     unsigned int num_irqs;
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

It's possible for {load,save}_cleanup SaveVMHandlers to get called without
the corresponding {load,save}_setup handler being called first.

One such example is if the {load,save}_setup handler of a preceding device
returns an error.
In this case the migration core cleanup code will call all corresponding
cleanup handlers, even for those devices which haven't had their setup
handler called.

Since this behavior can generate some surprises let's clearly document it
in the SaveVMHandlers description.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/migration/register.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index XXXXXXX..XXXXXXX 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers {
     /**
      * @save_cleanup
      *
-     * Uninitializes the data structures on the source
+     * Uninitializes the data structures on the source.
+     * Note that this handler can be called even if save_setup
+     * wasn't called earlier.
      *
      * @opaque: data pointer passed to register_savevm_live()
      */
@@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers {
      * @load_cleanup
      *
      * Uninitializes the data structures on the destination.
+     * Note that this handler can be called even if load_setup
+     * wasn't called earlier.
      *
      * @opaque: data pointer passed to register_savevm_live()
      *
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

This function name conflicts with one used by a future generic thread pool
function and it was only used by one test anyway.

Update the trace event name in thread_pool_submit_aio() accordingly.

Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/block/thread-pool.h   | 3 +--
 tests/unit/test-thread-pool.c | 6 +++---
 util/thread-pool.c            | 7 +------
 util/trace-events             | 2 +-
 4 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/thread-pool.h
+++ b/include/block/thread-pool.h
@@ -XXX,XX +XXX,XX @@ ThreadPool *thread_pool_new(struct AioContext *ctx);
 void thread_pool_free(ThreadPool *pool);
 
 /*
- * thread_pool_submit* API: submit I/O requests in the thread's
+ * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's
  * current AioContext.
  */
 BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
                                    BlockCompletionFunc *cb, void *opaque);
 int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg);
-void thread_pool_submit(ThreadPoolFunc *func, void *arg);
 
 void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx);
 
diff --git a/tests/unit/test-thread-pool.c b/tests/unit/test-thread-pool.c
index XXXXXXX..XXXXXXX 100644
--- a/tests/unit/test-thread-pool.c
+++ b/tests/unit/test-thread-pool.c
@@ -XXX,XX +XXX,XX @@ static void done_cb(void *opaque, int ret)
     active--;
 }
 
-static void test_submit(void)
+static void test_submit_no_complete(void)
 {
     WorkerTestData data = { .n = 0 };
-    thread_pool_submit(worker_cb, &data);
+    thread_pool_submit_aio(worker_cb, &data, NULL, NULL);
     while (data.n == 0) {
         aio_poll(ctx, true);
     }
@@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv)
     ctx = qemu_get_current_aio_context();
 
     g_test_init(&argc, &argv, NULL);
-    g_test_add_func("/thread-pool/submit", test_submit);
+    g_test_add_func("/thread-pool/submit-no-complete", test_submit_no_complete);
     g_test_add_func("/thread-pool/submit-aio", test_submit_aio);
     g_test_add_func("/thread-pool/submit-co", test_submit_co);
     g_test_add_func("/thread-pool/submit-many", test_submit_many);
diff --git a/util/thread-pool.c b/util/thread-pool.c
index XXXXXXX..XXXXXXX 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -XXX,XX +XXX,XX @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
 
     QLIST_INSERT_HEAD(&pool->head, req, all);
71 | 71 | ||
72 | - trace_thread_pool_submit(pool, req, arg); | 72 | - trace_thread_pool_submit(pool, req, arg); |
73 | + trace_thread_pool_submit_aio(pool, req, arg); | 73 | + trace_thread_pool_submit_aio(pool, req, arg); |
74 | 74 | ||
75 | qemu_mutex_lock(&pool->lock); | 75 | qemu_mutex_lock(&pool->lock); |
76 | if (pool->idle_threads == 0 && pool->cur_threads < pool->max_threads) { | 76 | if (pool->idle_threads == 0 && pool->cur_threads < pool->max_threads) { |
77 | @@ -XXX,XX +XXX,XX @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) | 77 | @@ -XXX,XX +XXX,XX @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) |
78 | return tpc.ret; | 78 | return tpc.ret; |
79 | } | 79 | } |
80 | 80 | ||
81 | -void thread_pool_submit(ThreadPoolFunc *func, void *arg) | 81 | -void thread_pool_submit(ThreadPoolFunc *func, void *arg) |
82 | -{ | 82 | -{ |
83 | - thread_pool_submit_aio(func, arg, NULL, NULL); | 83 | - thread_pool_submit_aio(func, arg, NULL, NULL); |
84 | -} | 84 | -} |
85 | - | 85 | - |
86 | void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) | 86 | void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) |
87 | { | 87 | { |
88 | qemu_mutex_lock(&pool->lock); | 88 | qemu_mutex_lock(&pool->lock); |
89 | diff --git a/util/trace-events b/util/trace-events | 89 | diff --git a/util/trace-events b/util/trace-events |
90 | index XXXXXXX..XXXXXXX 100644 | 90 | index XXXXXXX..XXXXXXX 100644 |
91 | --- a/util/trace-events | 91 | --- a/util/trace-events |
92 | +++ b/util/trace-events | 92 | +++ b/util/trace-events |
93 | @@ -XXX,XX +XXX,XX @@ aio_co_schedule_bh_cb(void *ctx, void *co) "ctx %p co %p" | 93 | @@ -XXX,XX +XXX,XX @@ aio_co_schedule_bh_cb(void *ctx, void *co) "ctx %p co %p" |
94 | reentrant_aio(void *ctx, const char *name) "ctx %p name %s" | 94 | reentrant_aio(void *ctx, const char *name) "ctx %p name %s" |
95 | 95 | ||
96 | # thread-pool.c | 96 | # thread-pool.c |
97 | -thread_pool_submit(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" | 97 | -thread_pool_submit(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" |
98 | +thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" | 98 | +thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" |
99 | thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" | 99 | thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" |
100 | thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" | 100 | thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" |
101 | 101 | ||
102 | 102 | diff view generated by jsdifflib |
New patch

From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Wire data commonly uses BE byte order (including in the existing migration
protocol); use it also for VFIO device state packets.

Fixes: 3228d311ab18 ("vfio/migration: Multifd device state transfer support - received buffers queuing")
Fixes: 6d644baef203 ("vfio/migration: Multifd device state transfer support - send side")
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 hw/vfio/migration-multifd.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/vfio/vfio-common.h"
 #include "migration/misc.h"
 #include "qapi/error.h"
+#include "qemu/bswap.h"
 #include "qemu/error-report.h"
 #include "qemu/lockable.h"
 #include "qemu/main-loop.h"
@@ -XXX,XX +XXX,XX @@ bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size,
         return false;
     }

+    packet->version = be32_to_cpu(packet->version);
     if (packet->version != VFIO_DEVICE_STATE_PACKET_VER_CURRENT) {
         error_setg(errp, "%s: packet has unknown version %" PRIu32,
                    vbasedev->name, packet->version);
         return false;
     }

+    packet->idx = be32_to_cpu(packet->idx);
+    packet->flags = be32_to_cpu(packet->flags);
+
     if (packet->idx == UINT32_MAX) {
         error_setg(errp, "%s: packet index is invalid", vbasedev->name);
         return false;
@@ -XXX,XX +XXX,XX @@ vfio_save_complete_precopy_thread_config_state(VFIODevice *vbasedev,

     packet_len = sizeof(*packet) + bioc->usage;
     packet = g_malloc0(packet_len);
-    packet->version = VFIO_DEVICE_STATE_PACKET_VER_CURRENT;
-    packet->idx = idx;
-    packet->flags = VFIO_DEVICE_STATE_CONFIG_STATE;
+    packet->version = cpu_to_be32(VFIO_DEVICE_STATE_PACKET_VER_CURRENT);
+    packet->idx = cpu_to_be32(idx);
+    packet->flags = cpu_to_be32(VFIO_DEVICE_STATE_CONFIG_STATE);
     memcpy(&packet->data, bioc->data, bioc->usage);

     if (!multifd_queue_device_state(idstr, instance_id,
@@ -XXX,XX +XXX,XX @@ vfio_multifd_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d,
     }

     packet = g_malloc0(sizeof(*packet) + migration->data_buffer_size);
-    packet->version = VFIO_DEVICE_STATE_PACKET_VER_CURRENT;
+    packet->version = cpu_to_be32(VFIO_DEVICE_STATE_PACKET_VER_CURRENT);

     for (idx = 0; ; idx++) {
         ssize_t data_size;
@@ -XXX,XX +XXX,XX @@ vfio_multifd_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d,
             break;
         }

-        packet->idx = idx;
+        packet->idx = cpu_to_be32(idx);
         packet_size = sizeof(*packet) + data_size;

         if (!multifd_queue_device_state(d->idstr, d->instance_id,
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

These names conflict with ones used by future generic thread pool
equivalents.
Generic names should belong to the generic pool type, not to the specific
(AIO) type.

Acked-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/block/aio.h         |  8 ++---
 include/block/thread-pool.h |  8 ++---
 util/async.c                |  6 ++--
 util/thread-pool.c          | 58 ++++++++++++++++++-------------------
 util/trace-events           |  4 +--
 5 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -XXX,XX +XXX,XX @@ typedef void QEMUBHFunc(void *opaque);
 typedef bool AioPollFn(void *opaque);
 typedef void IOHandler(void *opaque);

-struct ThreadPool;
+struct ThreadPoolAio;
 struct LinuxAioState;
 typedef struct LuringState LuringState;

@@ -XXX,XX +XXX,XX @@ struct AioContext {
     /* Thread pool for performing work and receiving completion callbacks.
      * Has its own locking.
      */
-    struct ThreadPool *thread_pool;
+    struct ThreadPoolAio *thread_pool;

 #ifdef CONFIG_LINUX_AIO
     struct LinuxAioState *linux_aio;
@@ -XXX,XX +XXX,XX @@ void aio_set_event_notifier_poll(AioContext *ctx,
  */
 GSource *aio_get_g_source(AioContext *ctx);

-/* Return the ThreadPool bound to this AioContext */
-struct ThreadPool *aio_get_thread_pool(AioContext *ctx);
+/* Return the ThreadPoolAio bound to this AioContext */
+struct ThreadPoolAio *aio_get_thread_pool(AioContext *ctx);

 /* Setup the LinuxAioState bound to this AioContext */
 struct LinuxAioState *aio_setup_linux_aio(AioContext *ctx, Error **errp);
diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/thread-pool.h
+++ b/include/block/thread-pool.h
@@ -XXX,XX +XXX,XX @@

 typedef int ThreadPoolFunc(void *opaque);

-typedef struct ThreadPool ThreadPool;
+typedef struct ThreadPoolAio ThreadPoolAio;

-ThreadPool *thread_pool_new(struct AioContext *ctx);
-void thread_pool_free(ThreadPool *pool);
+ThreadPoolAio *thread_pool_new_aio(struct AioContext *ctx);
+void thread_pool_free_aio(ThreadPoolAio *pool);

 /*
  * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's
@@ -XXX,XX +XXX,XX @@ void thread_pool_free(ThreadPool *pool);
 BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
                                    BlockCompletionFunc *cb, void *opaque);
 int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg);
+void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx);

-void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx);

 #endif
diff --git a/util/async.c b/util/async.c
index XXXXXXX..XXXXXXX 100644
--- a/util/async.c
+++ b/util/async.c
@@ -XXX,XX +XXX,XX @@ aio_ctx_finalize(GSource *source)
     QEMUBH *bh;
     unsigned flags;

-    thread_pool_free(ctx->thread_pool);
+    thread_pool_free_aio(ctx->thread_pool);

 #ifdef CONFIG_LINUX_AIO
     if (ctx->linux_aio) {
@@ -XXX,XX +XXX,XX @@ GSource *aio_get_g_source(AioContext *ctx)
     return &ctx->source;
 }

-ThreadPool *aio_get_thread_pool(AioContext *ctx)
+ThreadPoolAio *aio_get_thread_pool(AioContext *ctx)
 {
     if (!ctx->thread_pool) {
-        ctx->thread_pool = thread_pool_new(ctx);
+        ctx->thread_pool = thread_pool_new_aio(ctx);
     }
     return ctx->thread_pool;
 }
diff --git a/util/thread-pool.c b/util/thread-pool.c
index XXXXXXX..XXXXXXX 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -XXX,XX +XXX,XX @@
 #include "block/thread-pool.h"
 #include "qemu/main-loop.h"

-static void do_spawn_thread(ThreadPool *pool);
+static void do_spawn_thread(ThreadPoolAio *pool);

-typedef struct ThreadPoolElement ThreadPoolElement;
+typedef struct ThreadPoolElementAio ThreadPoolElementAio;

 enum ThreadState {
     THREAD_QUEUED,
@@ -XXX,XX +XXX,XX @@ enum ThreadState {
     THREAD_DONE,
 };

-struct ThreadPoolElement {
+struct ThreadPoolElementAio {
     BlockAIOCB common;
-    ThreadPool *pool;
+    ThreadPoolAio *pool;
     ThreadPoolFunc *func;
     void *arg;

@@ -XXX,XX +XXX,XX @@ struct ThreadPoolElement {
     int ret;

     /* Access to this list is protected by lock. */
-    QTAILQ_ENTRY(ThreadPoolElement) reqs;
+    QTAILQ_ENTRY(ThreadPoolElementAio) reqs;

     /* This list is only written by the thread pool's mother thread. */
-    QLIST_ENTRY(ThreadPoolElement) all;
+    QLIST_ENTRY(ThreadPoolElementAio) all;
 };

-struct ThreadPool {
+struct ThreadPoolAio {
     AioContext *ctx;
     QEMUBH *completion_bh;
     QemuMutex lock;
@@ -XXX,XX +XXX,XX @@ struct ThreadPool {
     QEMUBH *new_thread_bh;

     /* The following variables are only accessed from one AioContext. */
-    QLIST_HEAD(, ThreadPoolElement) head;
+    QLIST_HEAD(, ThreadPoolElementAio) head;

     /* The following variables are protected by lock. */
-    QTAILQ_HEAD(, ThreadPoolElement) request_list;
+    QTAILQ_HEAD(, ThreadPoolElementAio) request_list;
     int cur_threads;
     int idle_threads;
     int new_threads;     /* backlog of threads we need to create */
@@ -XXX,XX +XXX,XX @@ struct ThreadPool {

 static void *worker_thread(void *opaque)
 {
-    ThreadPool *pool = opaque;
+    ThreadPoolAio *pool = opaque;

     qemu_mutex_lock(&pool->lock);
     pool->pending_threads--;
     do_spawn_thread(pool);

     while (pool->cur_threads <= pool->max_threads) {
-        ThreadPoolElement *req;
+        ThreadPoolElementAio *req;
         int ret;

         if (QTAILQ_EMPTY(&pool->request_list)) {
@@ -XXX,XX +XXX,XX @@ static void *worker_thread(void *opaque)
     return NULL;
 }

-static void do_spawn_thread(ThreadPool *pool)
+static void do_spawn_thread(ThreadPoolAio *pool)
 {
     QemuThread t;

@@ -XXX,XX +XXX,XX @@ static void do_spawn_thread(ThreadPool *pool)

 static void spawn_thread_bh_fn(void *opaque)
 {
-    ThreadPool *pool = opaque;
+    ThreadPoolAio *pool = opaque;

     qemu_mutex_lock(&pool->lock);
     do_spawn_thread(pool);
     qemu_mutex_unlock(&pool->lock);
 }

-static void spawn_thread(ThreadPool *pool)
+static void spawn_thread(ThreadPoolAio *pool)
 {
     pool->cur_threads++;
     pool->new_threads++;
@@ -XXX,XX +XXX,XX @@ static void spawn_thread(ThreadPool *pool)

 static void thread_pool_completion_bh(void *opaque)
 {
-    ThreadPool *pool = opaque;
-    ThreadPoolElement *elem, *next;
+    ThreadPoolAio *pool = opaque;
+    ThreadPoolElementAio *elem, *next;

     defer_call_begin(); /* cb() may use defer_call() to coalesce work */

@@ -XXX,XX +XXX,XX @@ restart:
             continue;
         }

-        trace_thread_pool_complete(pool, elem, elem->common.opaque,
-                                   elem->ret);
+        trace_thread_pool_complete_aio(pool, elem, elem->common.opaque,
+                                       elem->ret);
         QLIST_REMOVE(elem, all);

         if (elem->common.cb) {
@@ -XXX,XX +XXX,XX @@ restart:

 static void thread_pool_cancel(BlockAIOCB *acb)
 {
-    ThreadPoolElement *elem = (ThreadPoolElement *)acb;
-    ThreadPool *pool = elem->pool;
+    ThreadPoolElementAio *elem = (ThreadPoolElementAio *)acb;
+    ThreadPoolAio *pool = elem->pool;

-    trace_thread_pool_cancel(elem, elem->common.opaque);
+    trace_thread_pool_cancel_aio(elem, elem->common.opaque);

     QEMU_LOCK_GUARD(&pool->lock);
     if (elem->state == THREAD_QUEUED) {
@@ -XXX,XX +XXX,XX @@ static void thread_pool_cancel(BlockAIOCB *acb)
 }

 static const AIOCBInfo thread_pool_aiocb_info = {
-    .aiocb_size         = sizeof(ThreadPoolElement),
+    .aiocb_size         = sizeof(ThreadPoolElementAio),
     .cancel_async       = thread_pool_cancel,
 };

 BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
                                    BlockCompletionFunc *cb, void *opaque)
 {
-    ThreadPoolElement *req;
+    ThreadPoolElementAio *req;
     AioContext *ctx = qemu_get_current_aio_context();
-    ThreadPool *pool = aio_get_thread_pool(ctx);
+    ThreadPoolAio *pool = aio_get_thread_pool(ctx);

     /* Assert that the thread submitting work is the same running the pool */
     assert(pool->ctx == qemu_get_current_aio_context());
@@ -XXX,XX +XXX,XX @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg)
     return tpc.ret;
 }

-void thread_pool_update_params(ThreadPool *pool, AioContext *ctx)
+void thread_pool_update_params(ThreadPoolAio *pool, AioContext *ctx)
 {
     qemu_mutex_lock(&pool->lock);

@@ -XXX,XX +XXX,XX @@ void thread_pool_update_params(ThreadPool *pool, AioContext *ctx)
     qemu_mutex_unlock(&pool->lock);
 }

-static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx)
+static void thread_pool_init_one(ThreadPoolAio *pool, AioContext *ctx)
 {
     if (!ctx) {
         ctx = qemu_get_aio_context();
@@ -XXX,XX +XXX,XX @@ static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx)
     thread_pool_update_params(pool, ctx);
 }

-ThreadPool *thread_pool_new(AioContext *ctx)
+ThreadPoolAio *thread_pool_new_aio(AioContext *ctx)
 {
-    ThreadPool *pool = g_new(ThreadPool, 1);
+    ThreadPoolAio *pool = g_new(ThreadPoolAio, 1);
     thread_pool_init_one(pool, ctx);
     return pool;
 }

-void thread_pool_free(ThreadPool *pool)
+void thread_pool_free_aio(ThreadPoolAio *pool)
 {
     if (!pool) {
         return;
diff --git a/util/trace-events b/util/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/util/trace-events
+++ b/util/trace-events
@@ -XXX,XX +XXX,XX @@ reentrant_aio(void *ctx, const char *name) "ctx %p name %s"

 # thread-pool.c
 thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p"
-thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d"
-thread_pool_cancel(void *req, void *opaque) "req %p opaque %p"
+thread_pool_complete_aio(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d"
+thread_pool_cancel_aio(void *req, void *opaque) "req %p opaque %p"

 # buffer.c
 buffer_resize(const char *buf, size_t olen, size_t len) "%s: old %zd, new %zd"
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Migration code wants to manage device data sending threads in one place.

QEMU has an existing thread pool implementation, however it is limited
to queuing AIO operations only and essentially has a 1:1 mapping between
the current AioContext and the AIO ThreadPool in use.

Implement a generic (non-AIO) ThreadPool by essentially wrapping GLib's
GThreadPool.

This brings a few new operations on a pool:
* thread_pool_wait() waits until all the submitted work requests
  have finished.

* thread_pool_set_max_threads() explicitly sets the maximum thread count
  in the pool.

* thread_pool_adjust_max_threads_to_work() adjusts the maximum thread count
  in the pool to equal the number of still queued or unfinished work items.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/block/thread-pool.h |  51 ++++++++++++++++
 util/thread-pool.c          | 119 ++++++++++++++++++++++++++++++++++++
 2 files changed, 170 insertions(+)

diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h
index XXXXXXX..XXXXXXX 100644
--- a/include/block/thread-pool.h
+++ b/include/block/thread-pool.h
@@ -XXX,XX +XXX,XX @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg,
 int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg);
 void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx);
 
+/* ------------------------------------------- */
+/* Generic thread pool types and methods below */
+typedef struct ThreadPool ThreadPool;
+
+/* Create a new thread pool. Never returns NULL. */
+ThreadPool *thread_pool_new(void);
+
+/*
+ * Free the thread pool.
+ * Waits for all the previously submitted work to complete before performing
+ * the actual freeing operation.
+ */
+void thread_pool_free(ThreadPool *pool);
+
+/*
+ * Submit a new work (task) for the pool.
+ *
+ * @opaque_destroy is an optional GDestroyNotify for the @opaque argument
+ * to the work function at @func.
+ */
+void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func,
+                        void *opaque, GDestroyNotify opaque_destroy);
+
+/*
+ * Submit a new work (task) for the pool, making sure it starts getting
+ * processed immediately, launching a new thread for it if necessary.
+ *
+ * @opaque_destroy is an optional GDestroyNotify for the @opaque argument
+ * to the work function at @func.
+ */
+void thread_pool_submit_immediate(ThreadPool *pool, ThreadPoolFunc *func,
+                                  void *opaque, GDestroyNotify opaque_destroy);
+
+/*
+ * Wait for all previously submitted work to complete before returning.
+ *
+ * Can be used as a barrier between two sets of tasks executed on a thread
+ * pool without destroying it or in a performance sensitive path where the
+ * caller just wants to wait for all tasks to complete while deferring the
+ * pool free operation for later, less performance sensitive time.
+ */
+void thread_pool_wait(ThreadPool *pool);
+
+/* Set the maximum number of threads in the pool. */
+bool thread_pool_set_max_threads(ThreadPool *pool, int max_threads);
+
+/*
+ * Adjust the maximum number of threads in the pool to give each task its
+ * own thread (exactly one thread per task).
+ */
+bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool);
 
 #endif
diff --git a/util/thread-pool.c b/util/thread-pool.c
index XXXXXXX..XXXXXXX 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -XXX,XX +XXX,XX @@ void thread_pool_free_aio(ThreadPoolAio *pool)
     qemu_mutex_destroy(&pool->lock);
     g_free(pool);
 }
+
+struct ThreadPool {
+    GThreadPool *t;
+    size_t cur_work;
+    QemuMutex cur_work_lock;
+    QemuCond all_finished_cond;
+};
+
+typedef struct {
+    ThreadPoolFunc *func;
+    void *opaque;
+    GDestroyNotify opaque_destroy;
+} ThreadPoolElement;
+
+static void thread_pool_func(gpointer data, gpointer user_data)
+{
+    ThreadPool *pool = user_data;
+    g_autofree ThreadPoolElement *el = data;
+
+    el->func(el->opaque);
+
+    if (el->opaque_destroy) {
+        el->opaque_destroy(el->opaque);
+    }
+
+    QEMU_LOCK_GUARD(&pool->cur_work_lock);
+
+    assert(pool->cur_work > 0);
+    pool->cur_work--;
+
+    if (pool->cur_work == 0) {
+        qemu_cond_signal(&pool->all_finished_cond);
+    }
+}
+
+ThreadPool *thread_pool_new(void)
+{
+    ThreadPool *pool = g_new(ThreadPool, 1);
+
+    pool->cur_work = 0;
+    qemu_mutex_init(&pool->cur_work_lock);
+    qemu_cond_init(&pool->all_finished_cond);
+
+    pool->t = g_thread_pool_new(thread_pool_func, pool, 0, TRUE, NULL);
+    /*
+     * g_thread_pool_new() can only return errors if initial thread(s)
+     * creation fails but we ask for 0 initial threads above.
+     */
+    assert(pool->t);
+
+    return pool;
+}
+
+void thread_pool_free(ThreadPool *pool)
+{
+    /*
+     * With _wait = TRUE this effectively waits for all
+     * previously submitted work to complete first.
+     */
+    g_thread_pool_free(pool->t, FALSE, TRUE);
+
+    qemu_cond_destroy(&pool->all_finished_cond);
+    qemu_mutex_destroy(&pool->cur_work_lock);
+
+    g_free(pool);
+}
+
+void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func,
+                        void *opaque, GDestroyNotify opaque_destroy)
+{
+    ThreadPoolElement *el = g_new(ThreadPoolElement, 1);
+
+    el->func = func;
+    el->opaque = opaque;
+    el->opaque_destroy = opaque_destroy;
+
+    WITH_QEMU_LOCK_GUARD(&pool->cur_work_lock) {
+        pool->cur_work++;
+    }
+
+    /*
+     * Ignore the return value since this function can only return errors
+     * if creation of an additional thread fails but even in this case the
+     * provided work is still getting queued (just for the existing threads).
+     */
+    g_thread_pool_push(pool->t, el, NULL);
+}
+
+void thread_pool_submit_immediate(ThreadPool *pool, ThreadPoolFunc *func,
+                                  void *opaque, GDestroyNotify opaque_destroy)
+{
+    thread_pool_submit(pool, func, opaque, opaque_destroy);
+    thread_pool_adjust_max_threads_to_work(pool);
+}
+
+void thread_pool_wait(ThreadPool *pool)
+{
+    QEMU_LOCK_GUARD(&pool->cur_work_lock);
+
+    while (pool->cur_work > 0) {
+        qemu_cond_wait(&pool->all_finished_cond,
+                       &pool->cur_work_lock);
+    }
+}
+
+bool thread_pool_set_max_threads(ThreadPool *pool,
+                                 int max_threads)
+{
+    assert(max_threads > 0);
+
+    return g_thread_pool_set_max_threads(pool->t, max_threads, NULL);
+}
+
+bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool)
+{
+    QEMU_LOCK_GUARD(&pool->cur_work_lock);
+
+    return thread_pool_set_max_threads(pool, pool->cur_work);
+}
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

This QEMU_VM_COMMAND sub-command and its switchover_start SaveVMHandler
are used to mark the switchover point in the main migration stream.

It can be used to inform the destination that all pre-switchover main
migration stream data has been sent/received so it can start to process
post-switchover data that it might have received via other migration
channels like the multifd ones.

Also add the relevant MigrationState bit stream compatibility property and
its hw_compat entry.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Zhang Chen <zhangckid@gmail.com> # for the COLO part
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 hw/core/machine.c                  |  1 +
 include/migration/client-options.h |  4 +++
 include/migration/register.h       | 12 +++++++++
 migration/colo.c                   |  3 +++
 migration/migration-hmp-cmds.c     |  2 ++
 migration/migration.c              |  2 ++
 migration/migration.h              |  2 ++
 migration/options.c                |  9 +++++++
 migration/savevm.c                 | 39 ++++++++++++++++++++++++++++++
 migration/savevm.h                 |  1 +
 migration/trace-events             |  1 +
 scripts/analyze-migration.py       | 11 +++++++++
 12 files changed, 87 insertions(+)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -XXX,XX +XXX,XX @@ GlobalProperty hw_compat_9_2[] = {
     { "virtio-balloon-pci-non-transitional", "vectors", "0" },
     { "virtio-mem-pci", "vectors", "0" },
     { "migration", "multifd-clean-tls-termination", "false" },
+    { "migration", "send-switchover-start", "off"},
 };
 const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
 
diff --git a/include/migration/client-options.h b/include/migration/client-options.h
index XXXXXXX..XXXXXXX 100644
--- a/include/migration/client-options.h
+++ b/include/migration/client-options.h
@@ -XXX,XX +XXX,XX @@
 #ifndef QEMU_MIGRATION_CLIENT_OPTIONS_H
 #define QEMU_MIGRATION_CLIENT_OPTIONS_H
 
+
+/* properties */
+bool migrate_send_switchover_start(void);
+
 /* capabilities */
 
 bool migrate_background_snapshot(void);
diff --git a/include/migration/register.h b/include/migration/register.h
index XXXXXXX..XXXXXXX 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers {
      * otherwise
      */
     bool (*switchover_ack_needed)(void *opaque);
+
+    /**
+     * @switchover_start
+     *
+     * Notifies that the switchover has started. Called only on
+     * the destination.
+     *
+     * @opaque: data pointer passed to register_savevm_live()
+     *
+     * Returns zero to indicate success and negative for error
+     */
+    int (*switchover_start)(void *opaque);
 } SaveVMHandlers;
 
 /**
diff --git a/migration/colo.c b/migration/colo.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -XXX,XX +XXX,XX @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         bql_unlock();
         goto out;
     }
+
+    qemu_savevm_maybe_send_switchover_start(s->to_dst_file);
+
     /* Note: device state is saved into buffer */
     ret = qemu_save_device_state(fb);
 
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -XXX,XX +XXX,XX @@ static void migration_global_dump(Monitor *mon)
                    ms->send_configuration ? "on" : "off");
     monitor_printf(mon, "send-section-footer: %s\n",
                    ms->send_section_footer ? "on" : "off");
+    monitor_printf(mon, "send-switchover-start: %s\n",
+                   ms->send_switchover_start ? "on" : "off");
     monitor_printf(mon, "clear-bitmap-shift: %u\n",
                    ms->clear_bitmap_shift);
 }
diff --git a/migration/migration.c b/migration/migration.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -XXX,XX +XXX,XX @@ static bool migration_switchover_start(MigrationState *s, Error **errp)
 
     precopy_notify_complete();
 
+    qemu_savevm_maybe_send_switchover_start(s->to_dst_file);
+
     return true;
 }
 
diff --git a/migration/migration.h b/migration/migration.h
index XXXXXXX..XXXXXXX 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -XXX,XX +XXX,XX @@ struct MigrationState {
     bool send_configuration;
     /* Whether we send section footer during migration */
     bool send_section_footer;
+    /* Whether we send switchover start notification during migration */
+    bool send_switchover_start;
 
     /* Needed by postcopy-pause state */
     QemuSemaphore postcopy_pause_sem;
diff --git a/migration/options.c b/migration/options.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -XXX,XX +XXX,XX @@ const Property migration_properties[] = {
                      send_configuration, true),
     DEFINE_PROP_BOOL("send-section-footer", MigrationState,
                      send_section_footer, true),
+    DEFINE_PROP_BOOL("send-switchover-start", MigrationState,
+                     send_switchover_start, true),
     DEFINE_PROP_BOOL("multifd-flush-after-each-section", MigrationState,
                      multifd_flush_after_each_section, false),
     DEFINE_PROP_UINT8("x-clear-bitmap-shift", MigrationState,
@@ -XXX,XX +XXX,XX @@ bool migrate_auto_converge(void)
     return s->capabilities[MIGRATION_CAPABILITY_AUTO_CONVERGE];
 }
 
+bool migrate_send_switchover_start(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return s->send_switchover_start;
+}
+
 bool migrate_background_snapshot(void)
 {
     MigrationState *s = migrate_get_current();
diff --git a/migration/savevm.c b/migration/savevm.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -XXX,XX +XXX,XX @@ enum qemu_vm_cmd {
     MIG_CMD_ENABLE_COLO, /* Enable COLO */
    MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */
     MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */
+    MIG_CMD_SWITCHOVER_START, /* Switchover start notification */
     MIG_CMD_MAX
 };
 
@@ -XXX,XX +XXX,XX @@ static struct mig_cmd_args {
     [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" },
     [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" },
     [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" },
+    [MIG_CMD_SWITCHOVER_START] = { .len = 0, .name = "SWITCHOVER_START" },
     [MIG_CMD_MAX] = { .len = -1, .name = "MAX" },
 };
 
@@ -XXX,XX +XXX,XX @@ void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name)
     qemu_savevm_command_send(f, MIG_CMD_RECV_BITMAP, len + 1, (uint8_t *)buf);
 }
 
+static void qemu_savevm_send_switchover_start(QEMUFile *f)
+{
+    trace_savevm_send_switchover_start();
+    qemu_savevm_command_send(f, MIG_CMD_SWITCHOVER_START, 0, NULL);
+}
+
+void qemu_savevm_maybe_send_switchover_start(QEMUFile *f)
+{
+    if (migrate_send_switchover_start()) {
+        qemu_savevm_send_switchover_start(f);
+    }
+}
+
 bool qemu_savevm_state_blocked(Error **errp)
 {
     SaveStateEntry *se;
@@ -XXX,XX +XXX,XX @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
 
     ret = qemu_file_get_error(f);
     if (ret == 0) {
+        qemu_savevm_maybe_send_switchover_start(f);
         qemu_savevm_state_complete_precopy(f, false);
         ret = qemu_file_get_error(f);
     }
@@ -XXX,XX +XXX,XX @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis)
     return ret;
 }
 
+static int loadvm_postcopy_handle_switchover_start(void)
+{
+    SaveStateEntry *se;
+
+    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+        int ret;
+
+        if (!se->ops || !se->ops->switchover_start) {
+            continue;
+        }
+
+        ret = se->ops->switchover_start(se->opaque);
+        if (ret < 0) {
+            return ret;
+        }
+    }
230 | + | 230 | + |
231 | + return 0; | 231 | + return 0; |
232 | +} | 232 | +} |
233 | + | 233 | + |
234 | /* | 234 | /* |
235 | * Process an incoming 'QEMU_VM_COMMAND' | 235 | * Process an incoming 'QEMU_VM_COMMAND' |
236 | * 0 just a normal return | 236 | * 0 just a normal return |
237 | @@ -XXX,XX +XXX,XX @@ static int loadvm_process_command(QEMUFile *f) | 237 | @@ -XXX,XX +XXX,XX @@ static int loadvm_process_command(QEMUFile *f) |
238 | 238 | ||
239 | case MIG_CMD_ENABLE_COLO: | 239 | case MIG_CMD_ENABLE_COLO: |
240 | return loadvm_process_enable_colo(mis); | 240 | return loadvm_process_enable_colo(mis); |
241 | + | 241 | + |
242 | + case MIG_CMD_SWITCHOVER_START: | 242 | + case MIG_CMD_SWITCHOVER_START: |
243 | + return loadvm_postcopy_handle_switchover_start(); | 243 | + return loadvm_postcopy_handle_switchover_start(); |
244 | } | 244 | } |
245 | 245 | ||
246 | return 0; | 246 | return 0; |
247 | diff --git a/migration/savevm.h b/migration/savevm.h | 247 | diff --git a/migration/savevm.h b/migration/savevm.h |
248 | index XXXXXXX..XXXXXXX 100644 | 248 | index XXXXXXX..XXXXXXX 100644 |
249 | --- a/migration/savevm.h | 249 | --- a/migration/savevm.h |
250 | +++ b/migration/savevm.h | 250 | +++ b/migration/savevm.h |
251 | @@ -XXX,XX +XXX,XX @@ void qemu_savevm_send_postcopy_listen(QEMUFile *f); | 251 | @@ -XXX,XX +XXX,XX @@ void qemu_savevm_send_postcopy_listen(QEMUFile *f); |
252 | void qemu_savevm_send_postcopy_run(QEMUFile *f); | 252 | void qemu_savevm_send_postcopy_run(QEMUFile *f); |
253 | void qemu_savevm_send_postcopy_resume(QEMUFile *f); | 253 | void qemu_savevm_send_postcopy_resume(QEMUFile *f); |
254 | void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name); | 254 | void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name); |
255 | +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f); | 255 | +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f); |
256 | 256 | ||
257 | void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, | 257 | void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, |
258 | uint16_t len, | 258 | uint16_t len, |
259 | diff --git a/migration/trace-events b/migration/trace-events | 259 | diff --git a/migration/trace-events b/migration/trace-events |
260 | index XXXXXXX..XXXXXXX 100644 | 260 | index XXXXXXX..XXXXXXX 100644 |
261 | --- a/migration/trace-events | 261 | --- a/migration/trace-events |
262 | +++ b/migration/trace-events | 262 | +++ b/migration/trace-events |
263 | @@ -XXX,XX +XXX,XX @@ savevm_send_postcopy_run(void) "" | 263 | @@ -XXX,XX +XXX,XX @@ savevm_send_postcopy_run(void) "" |
264 | savevm_send_postcopy_resume(void) "" | 264 | savevm_send_postcopy_resume(void) "" |
265 | savevm_send_colo_enable(void) "" | 265 | savevm_send_colo_enable(void) "" |
266 | savevm_send_recv_bitmap(char *name) "%s" | 266 | savevm_send_recv_bitmap(char *name) "%s" |
267 | +savevm_send_switchover_start(void) "" | 267 | +savevm_send_switchover_start(void) "" |
268 | savevm_state_setup(void) "" | 268 | savevm_state_setup(void) "" |
269 | savevm_state_resume_prepare(void) "" | 269 | savevm_state_resume_prepare(void) "" |
270 | savevm_state_header(void) "" | 270 | savevm_state_header(void) "" |
271 | diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py | 271 | diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py |
272 | index XXXXXXX..XXXXXXX 100755 | 272 | index XXXXXXX..XXXXXXX 100755 |
273 | --- a/scripts/analyze-migration.py | 273 | --- a/scripts/analyze-migration.py |
274 | +++ b/scripts/analyze-migration.py | 274 | +++ b/scripts/analyze-migration.py |
275 | @@ -XXX,XX +XXX,XX @@ class MigrationDump(object): | 275 | @@ -XXX,XX +XXX,XX @@ class MigrationDump(object): |
276 | QEMU_VM_SUBSECTION = 0x05 | 276 | QEMU_VM_SUBSECTION = 0x05 |
277 | QEMU_VM_VMDESCRIPTION = 0x06 | 277 | QEMU_VM_VMDESCRIPTION = 0x06 |
278 | QEMU_VM_CONFIGURATION = 0x07 | 278 | QEMU_VM_CONFIGURATION = 0x07 |
279 | + QEMU_VM_COMMAND = 0x08 | 279 | + QEMU_VM_COMMAND = 0x08 |
280 | QEMU_VM_SECTION_FOOTER= 0x7e | 280 | QEMU_VM_SECTION_FOOTER= 0x7e |
281 | + QEMU_MIG_CMD_SWITCHOVER_START = 0x0b | 281 | + QEMU_MIG_CMD_SWITCHOVER_START = 0x0b |
282 | 282 | ||
283 | def __init__(self, filename): | 283 | def __init__(self, filename): |
284 | self.section_classes = { | 284 | self.section_classes = { |
285 | @@ -XXX,XX +XXX,XX @@ def read(self, desc_only = False, dump_memory = False, | 285 | @@ -XXX,XX +XXX,XX @@ def read(self, desc_only = False, dump_memory = False, |
286 | elif section_type == self.QEMU_VM_SECTION_PART or section_type == self.QEMU_VM_SECTION_END: | 286 | elif section_type == self.QEMU_VM_SECTION_PART or section_type == self.QEMU_VM_SECTION_END: |
287 | section_id = file.read32() | 287 | section_id = file.read32() |
288 | self.sections[section_id].read() | 288 | self.sections[section_id].read() |
289 | + elif section_type == self.QEMU_VM_COMMAND: | 289 | + elif section_type == self.QEMU_VM_COMMAND: |
290 | + command_type = file.read16() | 290 | + command_type = file.read16() |
291 | + command_data_len = file.read16() | 291 | + command_data_len = file.read16() |
292 | + if command_type != self.QEMU_MIG_CMD_SWITCHOVER_START: | 292 | + if command_type != self.QEMU_MIG_CMD_SWITCHOVER_START: |
293 | + raise Exception("Unknown QEMU_VM_COMMAND: %x" % | 293 | + raise Exception("Unknown QEMU_VM_COMMAND: %x" % |
294 | + (command_type)) | 294 | + (command_type)) |
295 | + if command_data_len != 0: | 295 | + if command_data_len != 0: |
296 | + raise Exception("Invalid SWITCHOVER_START length: %x" % | 296 | + raise Exception("Invalid SWITCHOVER_START length: %x" % |
297 | + (command_data_len)) | 297 | + (command_data_len)) |
298 | elif section_type == self.QEMU_VM_SECTION_FOOTER: | 298 | elif section_type == self.QEMU_VM_SECTION_FOOTER: |
299 | read_section_id = file.read32() | 299 | read_section_id = file.read32() |
300 | if read_section_id != section_id: | 300 | if read_section_id != section_id: | diff view generated by jsdifflib |
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

qemu_loadvm_load_state_buffer() and its load_state_buffer
SaveVMHandler allow providing a device state buffer to an explicitly
specified device via its idstr and instance id.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/migration/register.h | 15 +++++++++++++++
 migration/savevm.c           | 23 +++++++++++++++++++++++
 migration/savevm.h           |  3 +++
 3 files changed, 41 insertions(+)

diff --git a/include/migration/register.h b/include/migration/register.h
index XXXXXXX..XXXXXXX 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers {
      */
     int (*load_state)(QEMUFile *f, void *opaque, int version_id);

+    /**
+     * @load_state_buffer (invoked outside the BQL)
+     *
+     * Load device state buffer provided to qemu_loadvm_load_state_buffer().
+     *
+     * @opaque: data pointer passed to register_savevm_live()
+     * @buf: the data buffer to load
+     * @len: the data length in buffer
+     * @errp: pointer to Error*, to store an error if it happens.
+     *
+     * Returns true to indicate success and false for errors.
+     */
+    bool (*load_state_buffer)(void *opaque, char *buf, size_t len,
+                              Error **errp);
+
     /**
      * @load_setup
      *
diff --git a/migration/savevm.c b/migration/savevm.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -XXX,XX +XXX,XX @@ int qemu_loadvm_approve_switchover(void)
     return migrate_send_rp_switchover_ack(mis);
 }

+bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id,
+                                   char *buf, size_t len, Error **errp)
+{
+    SaveStateEntry *se;
+
+    se = find_se(idstr, instance_id);
+    if (!se) {
+        error_setg(errp,
+                   "Unknown idstr %s or instance id %u for load state buffer",
+                   idstr, instance_id);
+        return false;
+    }
+
+    if (!se->ops || !se->ops->load_state_buffer) {
+        error_setg(errp,
+                   "idstr %s / instance %u has no load state buffer operation",
+                   idstr, instance_id);
+        return false;
+    }
+
+    return se->ops->load_state_buffer(se->opaque, buf, len, errp);
+}
+
 bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
                    bool has_devices, strList *devices, Error **errp)
 {
diff --git a/migration/savevm.h b/migration/savevm.h
index XXXXXXX..XXXXXXX 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -XXX,XX +XXX,XX @@ int qemu_loadvm_approve_switchover(void);
 int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
                                                     bool in_postcopy);

+bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id,
+                                   char *buf, size_t len, Error **errp);
+
 #endif
...
Add a TODO comment there so it's clear that it would be good to improve
handling of such (erroneous) case in the future.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 migration/migration.c | 16 ++++++++++++++++
 migration/savevm.c    |  4 ++++
 2 files changed, 20 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -XXX,XX +XXX,XX @@ void migration_incoming_state_destroy(void)
     struct MigrationIncomingState *mis = migration_incoming_get_current();

     multifd_recv_cleanup();
+
     /*
      * RAM state cleanup needs to happen after multifd cleanup, because
      * multifd threads can use some of its states (receivedmap).
+     *
+     * This call also needs BQL held since it calls all registered
+     * load_cleanup SaveVMHandlers and at least the VFIO implementation is
+     * BQL-sensitive.
+     *
+     * In addition to the above, it also performs cleanup of load threads
+     * thread pool.
+     * This cleanup operation is BQL-sensitive as it requires unlocking BQL
+     * so a thread possibly waiting for it could get unblocked and finally
+     * exit.
+     * The reason why a load thread may need to hold BQL in the first place
+     * is because address space modification operations require it.
+     *
+     * Check proper BQL state here rather than risk possible deadlock later.
      */
+    assert(bql_locked());
     qemu_loadvm_state_cleanup();

     if (mis->to_src_file) {
diff --git a/migration/savevm.c b/migration/savevm.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -XXX,XX +XXX,XX @@ static void *postcopy_ram_listen_thread(void *opaque)
...
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Automatic memory management helps avoid memory safety issues.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/qapi/error.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/qapi/error.h b/include/qapi/error.h
index XXXXXXX..XXXXXXX 100644
--- a/include/qapi/error.h
+++ b/include/qapi/error.h
@@ -XXX,XX +XXX,XX @@ Error *error_copy(const Error *err);
  */
 void error_free(Error *err);

+G_DEFINE_AUTOPTR_CLEANUP_FUNC(Error, error_free)
+
 /*
  * Convenience function to assert that *@errp is set, then silently free it.
  */
...
---
 include/migration/misc.h |  3 ++
 include/qemu/typedefs.h  |  2 +
 migration/migration.c    |  2 +-
 migration/migration.h    |  5 +++
 migration/savevm.c       | 95 +++++++++++++++++++++++++++++++++++++++-
 migration/savevm.h       |  2 +-
 6 files changed, 105 insertions(+), 4 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index XXXXXXX..XXXXXXX 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
...
diff --git a/migration/migration.c b/migration/migration.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -XXX,XX +XXX,XX @@ void migration_incoming_state_destroy(void)
      * Check proper BQL state here rather than risk possible deadlock later.
      */
     assert(bql_locked());
-    qemu_loadvm_state_cleanup();
+    qemu_loadvm_state_cleanup(mis);

     if (mis->to_src_file) {
         /* Tell source that we are done */
...
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    g_autoptr(Error) local_err = NULL;
+
+    if (!data->function(data->opaque, &mis->load_threads_abort, &local_err)) {
+        MigrationState *s = migrate_get_current();
+
+        /*
+         * Can't set load_threads_abort here since processing of main migration
+         * channel data could still be happening, resulting in launching of new
+         * load threads.
+         */
+
+        assert(local_err);
+
+        /*
+         * In case of multiple load threads failing which thread error
...
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Read packet header first so in the future we will be able to
differentiate between a RAM multifd packet and a device state multifd
packet.

Since these two are of different size we can't read the packet body until
we know which packet type it is.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 migration/multifd.c | 55 ++++++++++++++++++++++++++++++++++++---------
 migration/multifd.h |  5 +++++
 2 files changed, 49 insertions(+), 11 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -XXX,XX +XXX,XX @@ void multifd_send_fill_packet(MultiFDSendParams *p)

     memset(packet, 0, p->packet_len);

-    packet->magic = cpu_to_be32(MULTIFD_MAGIC);
-    packet->version = cpu_to_be32(MULTIFD_VERSION);
+    packet->hdr.magic = cpu_to_be32(MULTIFD_MAGIC);
+    packet->hdr.version = cpu_to_be32(MULTIFD_VERSION);

-    packet->flags = cpu_to_be32(p->flags);
+    packet->hdr.flags = cpu_to_be32(p->flags);
     packet->next_packet_size = cpu_to_be32(p->next_packet_size);

     packet_num = qatomic_fetch_inc(&multifd_send_state->packet_num);
@@ -XXX,XX +XXX,XX @@ void multifd_send_fill_packet(MultiFDSendParams *p)
            p->flags, p->next_packet_size);
 }

-static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
+static int multifd_recv_unfill_packet_header(MultiFDRecvParams *p,
+                                             const MultiFDPacketHdr_t *hdr,
+                                             Error **errp)
 {
-    const MultiFDPacket_t *packet = p->packet;
-    uint32_t magic = be32_to_cpu(packet->magic);
-    uint32_t version = be32_to_cpu(packet->version);
-    int ret = 0;
+    uint32_t magic = be32_to_cpu(hdr->magic);
+    uint32_t version = be32_to_cpu(hdr->version);

     if (magic != MULTIFD_MAGIC) {
         error_setg(errp, "multifd: received packet magic %x, expected %x",
@@ -XXX,XX +XXX,XX @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
         return -1;
     }

-    p->flags = be32_to_cpu(packet->flags);
+    p->flags = be32_to_cpu(hdr->flags);
+
+    return 0;
+}
+
+static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp)
+{
+    const MultiFDPacket_t *packet = p->packet;
+    int ret = 0;
+
     p->next_packet_size = be32_to_cpu(packet->next_packet_size);
     p->packet_num = be64_to_cpu(packet->packet_num);
     p->packets_recved++;
@@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque)
     }

     while (true) {
+        MultiFDPacketHdr_t hdr;
         uint32_t flags = 0;
         bool has_data = false;
+        uint8_t *pkt_buf;
+        size_t pkt_len;
+
         p->normal_num = 0;

         if (use_packets) {
             struct iovec iov = {
-                .iov_base = (void *)p->packet,
-                .iov_len = p->packet_len
+                .iov_base = (void *)&hdr,
+                .iov_len = sizeof(hdr)
             };

             if (multifd_recv_should_exit()) {
@@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque)
                 break;
             }

+            ret = multifd_recv_unfill_packet_header(p, &hdr, &local_err);
+            if (ret) {
+                break;
+            }
+
+            pkt_buf = (uint8_t *)p->packet + sizeof(hdr);
+            pkt_len = p->packet_len - sizeof(hdr);
+
+            ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len,
+                                           &local_err);
+            if (!ret) {
+                /* EOF */
+                error_setg(&local_err, "multifd: unexpected EOF after packet header");
+                break;
+            }
+
+            if (ret == -1) {
+                break;
+            }
+
             qemu_mutex_lock(&p->mutex);
             ret = multifd_recv_unfill_packet(p, &local_err);
             if (ret) {
diff --git a/migration/multifd.h b/migration/multifd.h
index XXXXXXX..XXXXXXX 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -XXX,XX +XXX,XX @@ typedef struct {
     uint32_t magic;
     uint32_t version;
     uint32_t flags;
+} __attribute__((packed)) MultiFDPacketHdr_t;
+
+typedef struct {
+    MultiFDPacketHdr_t hdr;
+
     /* maximum number of allocated pages */
     uint32_t pages_alloc;
     /* non zero pages */
...
device's load_state_buffer handler.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 migration/multifd.c | 101 +++++++++++++++++++++++++++++++++++++++----
 migration/multifd.h |  19 ++++++++-
 2 files changed, 108 insertions(+), 12 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index XXXXXXX..XXXXXXX 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
...
     trace_multifd_recv_sync_main(multifd_recv_state->packet_num);
 }

+static int multifd_device_state_recv(MultiFDRecvParams *p, Error **errp)
+{
+    g_autofree char *dev_state_buf = NULL;
+    int ret;
+
+    dev_state_buf = g_malloc(p->next_packet_size);
+
+    ret = qio_channel_read_all(p->c, dev_state_buf, p->next_packet_size, errp);
+    if (ret != 0) {
+        return ret;
+    }
+
+    if (p->packet_dev_state->idstr[sizeof(p->packet_dev_state->idstr) - 1]
104 | + sizeof(p->packet_dev_state->idstr)); | 103 | + != 0) { |
105 | + | 104 | + error_setg(errp, "unterminated multifd device state idstr"); |
106 | + if (!qemu_loadvm_load_state_buffer(idstr, | 105 | + return -1; |
106 | + } | ||
107 | + | ||
108 | + if (!qemu_loadvm_load_state_buffer(p->packet_dev_state->idstr, | ||
107 | + p->packet_dev_state->instance_id, | 109 | + p->packet_dev_state->instance_id, |
108 | + dev_state_buf, p->next_packet_size, | 110 | + dev_state_buf, p->next_packet_size, |
109 | + errp)) { | 111 | + errp)) { |
110 | + ret = -1; | 112 | + ret = -1; |
111 | + } | 113 | + } |
... | ... | ||
226 | } __attribute__((packed)) MultiFDPacket_t; | 228 | } __attribute__((packed)) MultiFDPacket_t; |
227 | 229 | ||
228 | +typedef struct { | 230 | +typedef struct { |
229 | + MultiFDPacketHdr_t hdr; | 231 | + MultiFDPacketHdr_t hdr; |
230 | + | 232 | + |
231 | + char idstr[256] QEMU_NONSTRING; | 233 | + char idstr[256]; |
232 | + uint32_t instance_id; | 234 | + uint32_t instance_id; |
233 | + | 235 | + |
234 | + /* size of the next packet that contains the actual data */ | 236 | + /* size of the next packet that contains the actual data */ |
235 | + uint32_t next_packet_size; | 237 | + uint32_t next_packet_size; |
236 | +} __attribute__((packed)) MultiFDPacketDeviceState_t; | 238 | +} __attribute__((packed)) MultiFDPacketDeviceState_t; |
237 | + | 239 | + |
238 | typedef struct { | 240 | typedef struct { |
239 | /* number of used pages */ | 241 | /* number of used pages */ |
240 | uint32_t num; | 242 | uint32_t num; |
241 | @@ -XXX,XX +XXX,XX @@ struct MultiFDRecvData { | ||
242 | off_t file_offset; | ||
243 | }; | ||
244 | |||
245 | +typedef struct { | ||
246 | + char *idstr; | ||
247 | + uint32_t instance_id; | ||
248 | + char *buf; | ||
249 | + size_t buf_len; | ||
250 | +} MultiFDDeviceState_t; | ||
251 | + | ||
252 | typedef enum { | ||
253 | MULTIFD_PAYLOAD_NONE, | ||
254 | MULTIFD_PAYLOAD_RAM, | ||
255 | @@ -XXX,XX +XXX,XX @@ typedef struct { | 243 | @@ -XXX,XX +XXX,XX @@ typedef struct { |
256 | 244 | ||
257 | /* thread local variables. No locking required */ | 245 | /* thread local variables. No locking required */ |
258 | 246 | ||
259 | - /* pointer to the packet */ | 247 | - /* pointer to the packet */ |
260 | + /* pointers to the possible packet types */ | 248 | + /* pointers to the possible packet types */ |
261 | MultiFDPacket_t *packet; | 249 | MultiFDPacket_t *packet; |
262 | + MultiFDPacketDeviceState_t *packet_dev_state; | 250 | + MultiFDPacketDeviceState_t *packet_dev_state; |
263 | /* size of the next packet that contains pages */ | 251 | /* size of the next packet that contains pages */ |
264 | uint32_t next_packet_size; | 252 | uint32_t next_packet_size; |
265 | /* packets received through this channel */ | 253 | /* packets received through this channel */ |
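The v6 change above drops the g_strndup() copy and instead rejects a wire-provided idstr that lacks a terminating NUL. The general pattern — bounded copy on the sender, terminator validation on the receiver — can be sketched like this (types and sizes are illustrative):

```c
#include <string.h>

#define IDSTR_LEN 256

/* Hypothetical wire struct carrying a fixed-size id string. */
typedef struct {
    char idstr[IDSTR_LEN];
} DemoDeviceStateHdr;

/* Sender side: bounded copy that always leaves a terminating NUL. */
static void demo_idstr_set(DemoDeviceStateHdr *hdr, const char *id)
{
    strncpy(hdr->idstr, id, sizeof(hdr->idstr) - 1);
    hdr->idstr[sizeof(hdr->idstr) - 1] = '\0';
}

/* Receiver side: treat the buffer as untrusted and reject it when the
 * last byte is not NUL, instead of silently truncating. */
static int demo_idstr_valid(const DemoDeviceStateHdr *hdr)
{
    return hdr->idstr[sizeof(hdr->idstr) - 1] == '\0';
}
```

Validating rather than truncating keeps a malformed sender from smuggling an unterminated string into later `%s`-style uses.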
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | The multifd_send() function is currently not thread safe; make it thread safe | 3 | The multifd_send() function is currently not thread safe; make it thread safe |
4 | by holding a lock during its execution. | 4 | by holding a lock during its execution. |
5 | 5 | ||
6 | This way it will be possible to safely call it concurrently from multiple | 6 | This way it will be possible to safely call it concurrently from multiple |
7 | threads. | 7 | threads. |
8 | 8 | ||
9 | Reviewed-by: Peter Xu <peterx@redhat.com> | 9 | Reviewed-by: Peter Xu <peterx@redhat.com> |
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
11 | --- | 11 | --- |
12 | migration/multifd.c | 8 ++++++++ | 12 | migration/multifd.c | 8 ++++++++ |
13 | 1 file changed, 8 insertions(+) | 13 | 1 file changed, 8 insertions(+) |
14 | 14 | ||
15 | diff --git a/migration/multifd.c b/migration/multifd.c | 15 | diff --git a/migration/multifd.c b/migration/multifd.c |
16 | index XXXXXXX..XXXXXXX 100644 | 16 | index XXXXXXX..XXXXXXX 100644 |
17 | --- a/migration/multifd.c | 17 | --- a/migration/multifd.c |
18 | +++ b/migration/multifd.c | 18 | +++ b/migration/multifd.c |
19 | @@ -XXX,XX +XXX,XX @@ typedef struct { | 19 | @@ -XXX,XX +XXX,XX @@ typedef struct { |
20 | 20 | ||
21 | struct { | 21 | struct { |
22 | MultiFDSendParams *params; | 22 | MultiFDSendParams *params; |
23 | + | 23 | + |
24 | + /* multifd_send() body is not thread safe, needs serialization */ | 24 | + /* multifd_send() body is not thread safe, needs serialization */ |
25 | + QemuMutex multifd_send_mutex; | 25 | + QemuMutex multifd_send_mutex; |
26 | + | 26 | + |
27 | /* | 27 | /* |
28 | * Global number of generated multifd packets. | 28 | * Global number of generated multifd packets. |
29 | * | 29 | * |
30 | @@ -XXX,XX +XXX,XX @@ bool multifd_send(MultiFDSendData **send_data) | 30 | @@ -XXX,XX +XXX,XX @@ bool multifd_send(MultiFDSendData **send_data) |
31 | return false; | 31 | return false; |
32 | } | 32 | } |
33 | 33 | ||
34 | + QEMU_LOCK_GUARD(&multifd_send_state->multifd_send_mutex); | 34 | + QEMU_LOCK_GUARD(&multifd_send_state->multifd_send_mutex); |
35 | + | 35 | + |
36 | /* We wait here, until at least one channel is ready */ | 36 | /* We wait here, until at least one channel is ready */ |
37 | qemu_sem_wait(&multifd_send_state->channels_ready); | 37 | qemu_sem_wait(&multifd_send_state->channels_ready); |
38 | 38 | ||
39 | @@ -XXX,XX +XXX,XX @@ static void multifd_send_cleanup_state(void) | 39 | @@ -XXX,XX +XXX,XX @@ static void multifd_send_cleanup_state(void) |
40 | socket_cleanup_outgoing_migration(); | 40 | socket_cleanup_outgoing_migration(); |
41 | qemu_sem_destroy(&multifd_send_state->channels_created); | 41 | qemu_sem_destroy(&multifd_send_state->channels_created); |
42 | qemu_sem_destroy(&multifd_send_state->channels_ready); | 42 | qemu_sem_destroy(&multifd_send_state->channels_ready); |
43 | + qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); | 43 | + qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); |
44 | g_free(multifd_send_state->params); | 44 | g_free(multifd_send_state->params); |
45 | multifd_send_state->params = NULL; | 45 | multifd_send_state->params = NULL; |
46 | g_free(multifd_send_state); | 46 | g_free(multifd_send_state); |
47 | @@ -XXX,XX +XXX,XX @@ bool multifd_send_setup(void) | 47 | @@ -XXX,XX +XXX,XX @@ bool multifd_send_setup(void) |
48 | thread_count = migrate_multifd_channels(); | 48 | thread_count = migrate_multifd_channels(); |
49 | multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); | 49 | multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); |
50 | multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); | 50 | multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); |
51 | + qemu_mutex_init(&multifd_send_state->multifd_send_mutex); | 51 | + qemu_mutex_init(&multifd_send_state->multifd_send_mutex); |
52 | qemu_sem_init(&multifd_send_state->channels_created, 0); | 52 | qemu_sem_init(&multifd_send_state->channels_created, 0); |
53 | qemu_sem_init(&multifd_send_state->channels_ready, 0); | 53 | qemu_sem_init(&multifd_send_state->channels_ready, 0); |
54 | qatomic_set(&multifd_send_state->exiting, 0); | 54 | qatomic_set(&multifd_send_state->exiting, 0); |
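QEMU_LOCK_GUARD() used in the hunk above releases the mutex automatically on every exit path of the enclosing scope. A rough standalone equivalent, assuming GCC/Clang's cleanup attribute — this sketch is not QEMU's actual macro:

```c
#include <pthread.h>

/* Scoped-lock helper in the spirit of QEMU_LOCK_GUARD: the cleanup
 * attribute runs demo_unlock() when the guard variable goes out of
 * scope, so early returns cannot leak the lock. */
static void demo_unlock(pthread_mutex_t **m)
{
    if (*m) {
        pthread_mutex_unlock(*m);
    }
}

#define DEMO_LOCK_GUARD(m)                                              \
    pthread_mutex_t *demo_guard                                         \
        __attribute__((cleanup(demo_unlock), unused)) =                 \
        (pthread_mutex_lock(m), (m))

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static int demo_counter;

/* Body runs with demo_mutex held; every return path unlocks. */
static int demo_send(void)
{
    DEMO_LOCK_GUARD(&demo_mutex);

    if (demo_counter < 0) {
        return -1;      /* early return still releases the lock */
    }
    demo_counter++;
    return 0;
}
```

Serializing the whole body this way is what allows multiple producer threads to call the send path concurrently without interleaving their channel bookkeeping.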
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | This way if there are fields there that need explicit disposal (like, for | 3 | This way if there are fields there that need explicit disposal (like, for |
4 | example, some attached buffers) they will be handled appropriately. | 4 | example, some attached buffers) they will be handled appropriately. |
5 | 5 | ||
6 | Add a related assert to multifd_set_payload_type() in order to make sure | 6 | Add a related assert to multifd_set_payload_type() in order to make sure |
7 | that this function is only used to fill a previously empty MultiFDSendData | 7 | that this function is only used to fill a previously empty MultiFDSendData |
8 | with some payload, not the other way around. | 8 | with some payload, not the other way around. |
9 | 9 | ||
10 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | 10 | Reviewed-by: Fabiano Rosas <farosas@suse.de> |
11 | Reviewed-by: Peter Xu <peterx@redhat.com> | 11 | Reviewed-by: Peter Xu <peterx@redhat.com> |
12 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 12 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
13 | --- | 13 | --- |
14 | migration/multifd-nocomp.c | 3 +-- | 14 | migration/multifd-nocomp.c | 3 +-- |
15 | migration/multifd.c | 31 ++++++++++++++++++++++++++++--- | 15 | migration/multifd.c | 31 ++++++++++++++++++++++++++++--- |
16 | migration/multifd.h | 5 +++++ | 16 | migration/multifd.h | 5 +++++ |
17 | 3 files changed, 34 insertions(+), 5 deletions(-) | 17 | 3 files changed, 34 insertions(+), 5 deletions(-) |
18 | 18 | ||
19 | diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c | 19 | diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c |
20 | index XXXXXXX..XXXXXXX 100644 | 20 | index XXXXXXX..XXXXXXX 100644 |
21 | --- a/migration/multifd-nocomp.c | 21 | --- a/migration/multifd-nocomp.c |
22 | +++ b/migration/multifd-nocomp.c | 22 | +++ b/migration/multifd-nocomp.c |
23 | @@ -XXX,XX +XXX,XX @@ void multifd_ram_save_setup(void) | 23 | @@ -XXX,XX +XXX,XX @@ void multifd_ram_save_setup(void) |
24 | 24 | ||
25 | void multifd_ram_save_cleanup(void) | 25 | void multifd_ram_save_cleanup(void) |
26 | { | 26 | { |
27 | - g_free(multifd_ram_send); | 27 | - g_free(multifd_ram_send); |
28 | - multifd_ram_send = NULL; | 28 | - multifd_ram_send = NULL; |
29 | + g_clear_pointer(&multifd_ram_send, multifd_send_data_free); | 29 | + g_clear_pointer(&multifd_ram_send, multifd_send_data_free); |
30 | } | 30 | } |
31 | 31 | ||
32 | static void multifd_set_file_bitmap(MultiFDSendParams *p) | 32 | static void multifd_set_file_bitmap(MultiFDSendParams *p) |
33 | diff --git a/migration/multifd.c b/migration/multifd.c | 33 | diff --git a/migration/multifd.c b/migration/multifd.c |
34 | index XXXXXXX..XXXXXXX 100644 | 34 | index XXXXXXX..XXXXXXX 100644 |
35 | --- a/migration/multifd.c | 35 | --- a/migration/multifd.c |
36 | +++ b/migration/multifd.c | 36 | +++ b/migration/multifd.c |
37 | @@ -XXX,XX +XXX,XX @@ MultiFDSendData *multifd_send_data_alloc(void) | 37 | @@ -XXX,XX +XXX,XX @@ MultiFDSendData *multifd_send_data_alloc(void) |
38 | return g_malloc0(size_minus_payload + max_payload_size); | 38 | return g_malloc0(size_minus_payload + max_payload_size); |
39 | } | 39 | } |
40 | 40 | ||
41 | +void multifd_send_data_clear(MultiFDSendData *data) | 41 | +void multifd_send_data_clear(MultiFDSendData *data) |
42 | +{ | 42 | +{ |
43 | + if (multifd_payload_empty(data)) { | 43 | + if (multifd_payload_empty(data)) { |
44 | + return; | 44 | + return; |
45 | + } | 45 | + } |
46 | + | 46 | + |
47 | + switch (data->type) { | 47 | + switch (data->type) { |
48 | + default: | 48 | + default: |
49 | + /* Nothing to do */ | 49 | + /* Nothing to do */ |
50 | + break; | 50 | + break; |
51 | + } | 51 | + } |
52 | + | 52 | + |
53 | + data->type = MULTIFD_PAYLOAD_NONE; | 53 | + data->type = MULTIFD_PAYLOAD_NONE; |
54 | +} | 54 | +} |
55 | + | 55 | + |
56 | +void multifd_send_data_free(MultiFDSendData *data) | 56 | +void multifd_send_data_free(MultiFDSendData *data) |
57 | +{ | 57 | +{ |
58 | + if (!data) { | 58 | + if (!data) { |
59 | + return; | 59 | + return; |
60 | + } | 60 | + } |
61 | + | 61 | + |
62 | + multifd_send_data_clear(data); | 62 | + multifd_send_data_clear(data); |
63 | + | 63 | + |
64 | + g_free(data); | 64 | + g_free(data); |
65 | +} | 65 | +} |
66 | + | 66 | + |
67 | static bool multifd_use_packets(void) | 67 | static bool multifd_use_packets(void) |
68 | { | 68 | { |
69 | return !migrate_mapped_ram(); | 69 | return !migrate_mapped_ram(); |
70 | @@ -XXX,XX +XXX,XX @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) | 70 | @@ -XXX,XX +XXX,XX @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) |
71 | qemu_sem_destroy(&p->sem_sync); | 71 | qemu_sem_destroy(&p->sem_sync); |
72 | g_free(p->name); | 72 | g_free(p->name); |
73 | p->name = NULL; | 73 | p->name = NULL; |
74 | - g_free(p->data); | 74 | - g_free(p->data); |
75 | - p->data = NULL; | 75 | - p->data = NULL; |
76 | + g_clear_pointer(&p->data, multifd_send_data_free); | 76 | + g_clear_pointer(&p->data, multifd_send_data_free); |
77 | p->packet_len = 0; | 77 | p->packet_len = 0; |
78 | g_free(p->packet); | 78 | g_free(p->packet); |
79 | p->packet = NULL; | 79 | p->packet = NULL; |
80 | @@ -XXX,XX +XXX,XX @@ static void *multifd_send_thread(void *opaque) | 80 | @@ -XXX,XX +XXX,XX @@ static void *multifd_send_thread(void *opaque) |
81 | (uint64_t)p->next_packet_size + p->packet_len); | 81 | (uint64_t)p->next_packet_size + p->packet_len); |
82 | 82 | ||
83 | p->next_packet_size = 0; | 83 | p->next_packet_size = 0; |
84 | - multifd_set_payload_type(p->data, MULTIFD_PAYLOAD_NONE); | 84 | - multifd_set_payload_type(p->data, MULTIFD_PAYLOAD_NONE); |
85 | + multifd_send_data_clear(p->data); | 85 | + multifd_send_data_clear(p->data); |
86 | 86 | ||
87 | /* | 87 | /* |
88 | * Making sure p->data is published before saying "we're | 88 | * Making sure p->data is published before saying "we're |
89 | diff --git a/migration/multifd.h b/migration/multifd.h | 89 | diff --git a/migration/multifd.h b/migration/multifd.h |
90 | index XXXXXXX..XXXXXXX 100644 | 90 | index XXXXXXX..XXXXXXX 100644 |
91 | --- a/migration/multifd.h | 91 | --- a/migration/multifd.h |
92 | +++ b/migration/multifd.h | 92 | +++ b/migration/multifd.h |
93 | @@ -XXX,XX +XXX,XX @@ static inline bool multifd_payload_empty(MultiFDSendData *data) | 93 | @@ -XXX,XX +XXX,XX @@ static inline bool multifd_payload_empty(MultiFDSendData *data) |
94 | static inline void multifd_set_payload_type(MultiFDSendData *data, | 94 | static inline void multifd_set_payload_type(MultiFDSendData *data, |
95 | MultiFDPayloadType type) | 95 | MultiFDPayloadType type) |
96 | { | 96 | { |
97 | + assert(multifd_payload_empty(data)); | 97 | + assert(multifd_payload_empty(data)); |
98 | + assert(type != MULTIFD_PAYLOAD_NONE); | 98 | + assert(type != MULTIFD_PAYLOAD_NONE); |
99 | + | 99 | + |
100 | data->type = type; | 100 | data->type = type; |
101 | } | 101 | } |
102 | 102 | ||
103 | @@ -XXX,XX +XXX,XX @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p) | 103 | @@ -XXX,XX +XXX,XX @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p) |
104 | void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); | 104 | void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); |
105 | bool multifd_send(MultiFDSendData **send_data); | 105 | bool multifd_send(MultiFDSendData **send_data); |
106 | MultiFDSendData *multifd_send_data_alloc(void); | 106 | MultiFDSendData *multifd_send_data_alloc(void); |
107 | +void multifd_send_data_clear(MultiFDSendData *data); | 107 | +void multifd_send_data_clear(MultiFDSendData *data); |
108 | +void multifd_send_data_free(MultiFDSendData *data); | 108 | +void multifd_send_data_free(MultiFDSendData *data); |
109 | 109 | ||
110 | static inline uint32_t multifd_ram_page_size(void) | 110 | static inline uint32_t multifd_ram_page_size(void) |
111 | { | 111 | { |
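The clear/free split introduced above — clear() disposes any payload-attached resources and marks the slot empty, free() additionally releases the container, and g_clear_pointer() NULLs the caller's pointer — can be mimicked in plain C like this (names are illustrative, not the migration code's):

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical payload holder mirroring the clear/free split. */
typedef struct {
    int type;          /* 0 means "no payload" */
    char *buf;         /* payload-owned buffer, may be NULL */
} DemoSendData;

static void demo_send_data_clear(DemoSendData *data)
{
    if (data->type == 0) {
        return;                 /* already empty */
    }
    free(data->buf);            /* dispose per-payload resources */
    data->buf = NULL;
    data->type = 0;
}

static void demo_send_data_free(DemoSendData *data)
{
    if (!data) {
        return;
    }
    demo_send_data_clear(data); /* never leak attached buffers */
    free(data);
}

/* g_clear_pointer-style helper: destroy the object, then NULL the
 * caller's pointer so a stale reference cannot be reused. */
#define DEMO_CLEAR_POINTER(pp, destroy) \
    do { destroy(*(pp)); *(pp) = NULL; } while (0)
```

Routing every disposal through clear() is what makes the later device-state payload (which owns a heap buffer) safe to drop from any cleanup path.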
... | ... | ||
---|---|---|---|
6 | Reviewed-by: Peter Xu <peterx@redhat.com> | 6 | Reviewed-by: Peter Xu <peterx@redhat.com> |
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
8 | --- | 8 | --- |
9 | include/migration/misc.h | 4 ++ | 9 | include/migration/misc.h | 4 ++ |
10 | migration/meson.build | 1 + | 10 | migration/meson.build | 1 + |
11 | migration/multifd-device-state.c | 115 +++++++++++++++++++++++++++++++ | 11 | migration/multifd-device-state.c | 118 +++++++++++++++++++++++++++++++ |
12 | migration/multifd-nocomp.c | 14 +++- | 12 | migration/multifd-nocomp.c | 14 +++- |
13 | migration/multifd.c | 42 +++++++++-- | 13 | migration/multifd.c | 42 +++++++++-- |
14 | migration/multifd.h | 27 +++++--- | 14 | migration/multifd.h | 34 ++++++--- |
15 | 6 files changed, 187 insertions(+), 16 deletions(-) | 15 | 6 files changed, 197 insertions(+), 16 deletions(-) |
16 | create mode 100644 migration/multifd-device-state.c | 16 | create mode 100644 migration/multifd-device-state.c |
17 | 17 | ||
18 | diff --git a/include/migration/misc.h b/include/migration/misc.h | 18 | diff --git a/include/migration/misc.h b/include/migration/misc.h |
19 | index XXXXXXX..XXXXXXX 100644 | 19 | index XXXXXXX..XXXXXXX 100644 |
20 | --- a/include/migration/misc.h | 20 | --- a/include/migration/misc.h |
... | ... | ||
51 | + * | 51 | + * |
52 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. | 52 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. |
53 | + * | 53 | + * |
54 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. | 54 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. |
55 | + * See the COPYING file in the top-level directory. | 55 | + * See the COPYING file in the top-level directory. |
56 | + * | ||
57 | + * SPDX-License-Identifier: GPL-2.0-or-later | ||
56 | + */ | 58 | + */ |
57 | + | 59 | + |
58 | +#include "qemu/osdep.h" | 60 | +#include "qemu/osdep.h" |
59 | +#include "qemu/lockable.h" | 61 | +#include "qemu/lockable.h" |
60 | +#include "migration/misc.h" | 62 | +#include "migration/misc.h" |
... | ... | ||
101 | +{ | 103 | +{ |
102 | + MultiFDDeviceState_t *device_state = &p->data->u.device_state; | 104 | + MultiFDDeviceState_t *device_state = &p->data->u.device_state; |
103 | + MultiFDPacketDeviceState_t *packet = p->packet_device_state; | 105 | + MultiFDPacketDeviceState_t *packet = p->packet_device_state; |
104 | + | 106 | + |
105 | + packet->hdr.flags = cpu_to_be32(p->flags); | 107 | + packet->hdr.flags = cpu_to_be32(p->flags); |
106 | + strncpy(packet->idstr, device_state->idstr, sizeof(packet->idstr)); | 108 | + strncpy(packet->idstr, device_state->idstr, sizeof(packet->idstr) - 1); |
109 | + packet->idstr[sizeof(packet->idstr) - 1] = 0; | ||
107 | + packet->instance_id = cpu_to_be32(device_state->instance_id); | 110 | + packet->instance_id = cpu_to_be32(device_state->instance_id); |
108 | + packet->next_packet_size = cpu_to_be32(p->next_packet_size); | 111 | + packet->next_packet_size = cpu_to_be32(p->next_packet_size); |
109 | +} | 112 | +} |
110 | + | 113 | + |
111 | +static void multifd_prepare_header_device_state(MultiFDSendParams *p) | 114 | +static void multifd_prepare_header_device_state(MultiFDSendParams *p) |
... | ... | ||
346 | err: | 349 | err: |
347 | diff --git a/migration/multifd.h b/migration/multifd.h | 350 | diff --git a/migration/multifd.h b/migration/multifd.h |
348 | index XXXXXXX..XXXXXXX 100644 | 351 | index XXXXXXX..XXXXXXX 100644 |
349 | --- a/migration/multifd.h | 352 | --- a/migration/multifd.h |
350 | +++ b/migration/multifd.h | 353 | +++ b/migration/multifd.h |
351 | @@ -XXX,XX +XXX,XX @@ typedef struct { | 354 | @@ -XXX,XX +XXX,XX @@ struct MultiFDRecvData { |
355 | off_t file_offset; | ||
356 | }; | ||
357 | |||
358 | +typedef struct { | ||
359 | + char *idstr; | ||
360 | + uint32_t instance_id; | ||
361 | + char *buf; | ||
362 | + size_t buf_len; | ||
363 | +} MultiFDDeviceState_t; | ||
364 | + | ||
352 | typedef enum { | 365 | typedef enum { |
353 | MULTIFD_PAYLOAD_NONE, | 366 | MULTIFD_PAYLOAD_NONE, |
354 | MULTIFD_PAYLOAD_RAM, | 367 | MULTIFD_PAYLOAD_RAM, |
355 | + MULTIFD_PAYLOAD_DEVICE_STATE, | 368 | + MULTIFD_PAYLOAD_DEVICE_STATE, |
356 | } MultiFDPayloadType; | 369 | } MultiFDPayloadType; |
... | ... | ||
... | ... | ||
---|---|---|---|
89 | - * (MultiFDPages_t + flex array). | 89 | - * (MultiFDPages_t + flex array). |
90 | - */ | 90 | - */ |
91 | - max_payload_size = MAX(multifd_ram_payload_size(), | 91 | - max_payload_size = MAX(multifd_ram_payload_size(), |
92 | - multifd_device_state_payload_size()); | 92 | - multifd_device_state_payload_size()); |
93 | - max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); | 93 | - max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); |
94 | - | 94 | + multifd_ram_payload_alloc(&new->u.ram); |
95 | + /* Device state allocates its payload on-demand */ | ||
96 | |||
95 | - /* | 97 | - /* |
96 | - * Account for any holes the compiler might insert. We can't pack | 98 | - * Account for any holes the compiler might insert. We can't pack |
97 | - * the structure because that misaligns the members and triggers | 99 | - * the structure because that misaligns the members and triggers |
98 | - * Waddress-of-packed-member. | 100 | - * Waddress-of-packed-member. |
99 | - */ | 101 | - */ |
100 | - size_minus_payload = sizeof(MultiFDSendData) - sizeof(MultiFDPayload); | 102 | - size_minus_payload = sizeof(MultiFDSendData) - sizeof(MultiFDPayload); |
101 | + multifd_ram_payload_alloc(&new->u.ram); | 103 | - |
102 | + /* Device state allocates its payload on-demand */ | ||
103 | |||
104 | - return g_malloc0(size_minus_payload + max_payload_size); | 104 | - return g_malloc0(size_minus_payload + max_payload_size); |
105 | + return new; | 105 | + return new; |
106 | } | 106 | } |
107 | 107 | ||
108 | void multifd_send_data_clear(MultiFDSendData *data) | 108 | void multifd_send_data_clear(MultiFDSendData *data) |
... | ... | ||
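The sizing computation removed in the hunk above relies on a common C trick: allocate the container with room for the largest union variant, while accounting for any padding the compiler inserts before the trailing union. A standalone sketch under illustrative type names:

```c
#include <stdlib.h>

typedef struct { size_t n; } DemoRam;
typedef struct { char id[64]; } DemoDevState;

typedef union {
    DemoRam ram;
    DemoDevState dev;
} DemoPayload;

typedef struct {
    int type;
    DemoPayload u;          /* must be the last member */
} DemoSendSlot;

/* Allocate the container with room for the largest payload variant.
 * sizeof(DemoSendSlot) - sizeof(DemoPayload) keeps whatever padding
 * the compiler placed before `u`, so `u` stays correctly aligned. */
static DemoSendSlot *demo_send_slot_alloc(size_t max_payload)
{
    size_t head = sizeof(DemoSendSlot) - sizeof(DemoPayload);

    if (max_payload < sizeof(DemoPayload)) {
        max_payload = sizeof(DemoPayload);
    }
    return calloc(1, head + max_payload);
}
```

The patch moves away from this over-allocation because the device-state payload now allocates its buffers on demand, so only the RAM variant needs up-front sizing.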
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | Since device state transfer via multifd channels requires multifd | 3 | Since device state transfer via multifd channels requires multifd |
4 | channels with packets and is currently not compatible with multifd | 4 | channels with packets and is currently not compatible with multifd |
5 | compression, add an appropriate query function so a device can learn | 5 | compression, add an appropriate query function so a device can learn |
6 | whether it can actually make use of it. | 6 | whether it can actually make use of it. |
7 | 7 | ||
8 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | 8 | Reviewed-by: Fabiano Rosas <farosas@suse.de> |
9 | Reviewed-by: Peter Xu <peterx@redhat.com> | 9 | Reviewed-by: Peter Xu <peterx@redhat.com> |
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
11 | --- | 11 | --- |
12 | include/migration/misc.h | 1 + | 12 | include/migration/misc.h | 1 + |
13 | migration/multifd-device-state.c | 7 +++++++ | 13 | migration/multifd-device-state.c | 7 +++++++ |
14 | 2 files changed, 8 insertions(+) | 14 | 2 files changed, 8 insertions(+) |
15 | 15 | ||
16 | diff --git a/include/migration/misc.h b/include/migration/misc.h | 16 | diff --git a/include/migration/misc.h b/include/migration/misc.h |
17 | index XXXXXXX..XXXXXXX 100644 | 17 | index XXXXXXX..XXXXXXX 100644 |
18 | --- a/include/migration/misc.h | 18 | --- a/include/migration/misc.h |
19 | +++ b/include/migration/misc.h | 19 | +++ b/include/migration/misc.h |
20 | @@ -XXX,XX +XXX,XX @@ bool migrate_uri_parse(const char *uri, MigrationChannel **channel, | 20 | @@ -XXX,XX +XXX,XX @@ bool migrate_uri_parse(const char *uri, MigrationChannel **channel, |
21 | /* migration/multifd-device-state.c */ | 21 | /* migration/multifd-device-state.c */ |
22 | bool multifd_queue_device_state(char *idstr, uint32_t instance_id, | 22 | bool multifd_queue_device_state(char *idstr, uint32_t instance_id, |
23 | char *data, size_t len); | 23 | char *data, size_t len); |
24 | +bool multifd_device_state_supported(void); | 24 | +bool multifd_device_state_supported(void); |
25 | 25 | ||
26 | #endif | 26 | #endif |
27 | diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c | 27 | diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c |
28 | index XXXXXXX..XXXXXXX 100644 | 28 | index XXXXXXX..XXXXXXX 100644 |
29 | --- a/migration/multifd-device-state.c | 29 | --- a/migration/multifd-device-state.c |
30 | +++ b/migration/multifd-device-state.c | 30 | +++ b/migration/multifd-device-state.c |
31 | @@ -XXX,XX +XXX,XX @@ | 31 | @@ -XXX,XX +XXX,XX @@ |
32 | #include "qemu/lockable.h" | 32 | #include "qemu/lockable.h" |
33 | #include "migration/misc.h" | 33 | #include "migration/misc.h" |
34 | #include "multifd.h" | 34 | #include "multifd.h" |
35 | +#include "options.h" | 35 | +#include "options.h" |
36 | 36 | ||
37 | static struct { | 37 | static struct { |
38 | QemuMutex queue_job_mutex; | 38 | QemuMutex queue_job_mutex; |
39 | @@ -XXX,XX +XXX,XX @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, | 39 | @@ -XXX,XX +XXX,XX @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, |
40 | 40 | ||
41 | return true; | 41 | return true; |
42 | } | 42 | } |
43 | + | 43 | + |
44 | +bool multifd_device_state_supported(void) | 44 | +bool multifd_device_state_supported(void) |
45 | +{ | 45 | +{ |
46 | + return migrate_multifd() && !migrate_mapped_ram() && | 46 | + return migrate_multifd() && !migrate_mapped_ram() && |
47 | + migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; | 47 | + migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; |
48 | +} | 48 | +} |
... | ... | ||
---|---|---|---|
10 | Management of these threads is done in the multifd migration code, | 10 | Management of these threads is done in the multifd migration code, |
11 | wrapping them in the generic thread pool. | 11 | wrapping them in the generic thread pool. |
12 | 12 | ||
13 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 13 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
14 | --- | 14 | --- |
15 | include/migration/misc.h | 17 +++++++ | 15 | include/migration/misc.h | 17 ++++++ |
16 | include/migration/register.h | 19 +++++++ | 16 | include/migration/register.h | 19 +++++++ |
17 | include/qemu/typedefs.h | 3 ++ | 17 | include/qemu/typedefs.h | 3 ++ |
18 | migration/multifd-device-state.c | 85 ++++++++++++++++++++++++++++++++ | 18 | migration/multifd-device-state.c | 92 ++++++++++++++++++++++++++++++++ |
19 | migration/savevm.c | 35 ++++++++++++- | 19 | migration/savevm.c | 40 +++++++++++++- |
20 | 5 files changed, 158 insertions(+), 1 deletion(-) | 20 | 5 files changed, 170 insertions(+), 1 deletion(-) |
21 | 21 | ||
22 | diff --git a/include/migration/misc.h b/include/migration/misc.h | 22 | diff --git a/include/migration/misc.h b/include/migration/misc.h |
23 | index XXXXXXX..XXXXXXX 100644 | 23 | index XXXXXXX..XXXXXXX 100644 |
24 | --- a/include/migration/misc.h | 24 | --- a/include/migration/misc.h |
25 | +++ b/include/migration/misc.h | 25 | +++ b/include/migration/misc.h |
... | ... | ||
157 | + SaveLiveCompletePrecopyThreadData *data = opaque; | 157 | + SaveLiveCompletePrecopyThreadData *data = opaque; |
158 | + g_autoptr(Error) local_err = NULL; | 158 | + g_autoptr(Error) local_err = NULL; |
159 | + | 159 | + |
160 | + if (!data->hdlr(data, &local_err)) { | 160 | + if (!data->hdlr(data, &local_err)) { |
161 | + MigrationState *s = migrate_get_current(); | 161 | + MigrationState *s = migrate_get_current(); |
162 | + | ||
163 | + /* | ||
164 | + * Can't call abort_device_state_save_threads() here since new | ||
165 | + * save threads could still be in process of being launched | ||
166 | + * (if, for example, the very first save thread launched exited | ||
167 | + * with an error very quickly). | ||
168 | + */ | ||
162 | + | 169 | + |
163 | + assert(local_err); | 170 | + assert(local_err); |
164 | + | 171 | + |
165 | + /* | 172 | + /* |
166 | + * In case of multiple save threads failing which thread error | 173 | + * In case of multiple save threads failing which thread error |
... | ... | ||
265 | end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); | 272 | end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); |
266 | trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, | 273 | trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, |
267 | end_ts_each - start_ts_each); | 274 | end_ts_each - start_ts_each); |
268 | } | 275 | } |
269 | 276 | ||
270 | + if (multifd_device_state && | 277 | + if (multifd_device_state) { |
271 | + !multifd_join_device_state_save_threads()) { | 278 | + if (migrate_has_error(migrate_get_current())) { |
272 | + qemu_file_set_error(f, -EINVAL); | 279 | + multifd_abort_device_state_save_threads(); |
273 | + return -1; | 280 | + } |
281 | + | ||
282 | + if (!multifd_join_device_state_save_threads()) { | ||
283 | + qemu_file_set_error(f, -EINVAL); | ||
284 | + return -1; | ||
285 | + } | ||
274 | + } | 286 | + } |
275 | + | 287 | + |
276 | trace_vmstate_downtime_checkpoint("src-iterable-saved"); | 288 | trace_vmstate_downtime_checkpoint("src-iterable-saved"); |
277 | 289 | ||
278 | return 0; | 290 | return 0; |
... | ... | ||
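The join path above lets already-launched save threads run to completion while a recorded error aborts the remaining work, and the joiner reports overall success only when no thread failed. A pthread-based sketch of that join-and-propagate-error flow (names are hypothetical, not the migration code's API):

```c
#include <pthread.h>
#include <stdatomic.h>

/* Workers bail out once any of them records a failure; the joiner
 * waits for all of them and then reports whether they all succeeded. */
static atomic_bool demo_abort;
static atomic_int demo_failures;

static void *demo_save_worker(void *arg)
{
    int should_fail = *(int *)arg;

    if (atomic_load(&demo_abort)) {
        return NULL;                    /* another worker already failed */
    }
    if (should_fail) {
        atomic_store(&demo_abort, 1);   /* tell the other workers to stop */
        atomic_fetch_add(&demo_failures, 1);
    }
    return NULL;
}

/* Join every worker unconditionally, then report overall success. */
static int demo_join_save_threads(pthread_t *threads, int n)
{
    for (int i = 0; i < n; i++) {
        pthread_join(threads[i], NULL);
    }
    return atomic_load(&demo_failures) == 0 ? 0 : -1;
}
```

Joining unconditionally matters for the same reason the patch comments note: new threads may still be launching when the first one fails, so the error flag must be polled by the workers rather than the launcher trying to cancel them.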
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | And rename existing load_device_config_state trace event to | 3 | And rename existing load_device_config_state trace event to |
4 | load_device_config_state_end for consistency since it is triggered at the | 4 | load_device_config_state_end for consistency since it is triggered at the |
5 | end of loading of the VFIO device config state. | 5 | end of loading of the VFIO device config state. |
6 | 6 | ||
7 | This way both the start and end points of a particular device config | 7 | This way both the start and end points of a particular device config |
8 | loading operation (a long, BQL-serialized operation) are known. | 8 | loading operation (a long, BQL-serialized operation) are known. |
9 | 9 | ||
10 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | 10 | Reviewed-by: Cédric Le Goater <clg@redhat.com> |
11 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 11 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
12 | --- | 12 | --- |
13 | hw/vfio/migration.c | 4 +++- | 13 | hw/vfio/migration.c | 4 +++- |
14 | hw/vfio/trace-events | 3 ++- | 14 | hw/vfio/trace-events | 3 ++- |
15 | 2 files changed, 5 insertions(+), 2 deletions(-) | 15 | 2 files changed, 5 insertions(+), 2 deletions(-) |
16 | 16 | ||
17 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | 17 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c |
18 | index XXXXXXX..XXXXXXX 100644 | 18 | index XXXXXXX..XXXXXXX 100644 |
19 | --- a/hw/vfio/migration.c | 19 | --- a/hw/vfio/migration.c |
20 | +++ b/hw/vfio/migration.c | 20 | +++ b/hw/vfio/migration.c |
21 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) | 21 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) |
22 | VFIODevice *vbasedev = opaque; | 22 | VFIODevice *vbasedev = opaque; |
23 | uint64_t data; | 23 | uint64_t data; |
24 | 24 | ||
25 | + trace_vfio_load_device_config_state_start(vbasedev->name); | 25 | + trace_vfio_load_device_config_state_start(vbasedev->name); |
26 | + | 26 | + |
27 | if (vbasedev->ops && vbasedev->ops->vfio_load_config) { | 27 | if (vbasedev->ops && vbasedev->ops->vfio_load_config) { |
28 | int ret; | 28 | int ret; |
29 | 29 | ||
30 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) | 30 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) |
31 | return -EINVAL; | 31 | return -EINVAL; |
32 | } | 32 | } |
33 | 33 | ||
34 | - trace_vfio_load_device_config_state(vbasedev->name); | 34 | - trace_vfio_load_device_config_state(vbasedev->name); |
35 | + trace_vfio_load_device_config_state_end(vbasedev->name); | 35 | + trace_vfio_load_device_config_state_end(vbasedev->name); |
36 | return qemu_file_get_error(f); | 36 | return qemu_file_get_error(f); |
37 | } | 37 | } |
38 | 38 | ||
39 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | 39 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events |
40 | index XXXXXXX..XXXXXXX 100644 | 40 | index XXXXXXX..XXXXXXX 100644 |
41 | --- a/hw/vfio/trace-events | 41 | --- a/hw/vfio/trace-events |
42 | +++ b/hw/vfio/trace-events | 42 | +++ b/hw/vfio/trace-events |
43 | @@ -XXX,XX +XXX,XX @@ vfio_display_edid_write_error(void) "" | 43 | @@ -XXX,XX +XXX,XX @@ vfio_display_edid_write_error(void) "" |
44 | 44 | ||
45 | # migration.c | 45 | # migration.c |
46 | vfio_load_cleanup(const char *name) " (%s)" | 46 | vfio_load_cleanup(const char *name) " (%s)" |
47 | -vfio_load_device_config_state(const char *name) " (%s)" | 47 | -vfio_load_device_config_state(const char *name) " (%s)" |
48 | +vfio_load_device_config_state_start(const char *name) " (%s)" | 48 | +vfio_load_device_config_state_start(const char *name) " (%s)" |
49 | +vfio_load_device_config_state_end(const char *name) " (%s)" | 49 | +vfio_load_device_config_state_end(const char *name) " (%s)" |
50 | vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 | 50 | vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 |
51 | vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" | 51 | vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" |
52 | vfio_migration_realize(const char *name) " (%s)" | 52 | vfio_migration_realize(const char *name) " (%s)" |
... | ... | ||
---|---|---|---|
7 | variables. | 7 | variables. |
8 | 8 | ||
9 | Using 32-bit counters on 32-bit host platforms should not be a problem | 9 | Using 32-bit counters on 32-bit host platforms should not be a problem |
10 | in practice since they can't realistically address more memory anyway. | 10 | in practice since they can't realistically address more memory anyway. |
11 | 11 | ||
12 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
12 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 13 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
13 | --- | 14 | --- |
14 | hw/vfio/migration.c | 8 ++++---- | 15 | hw/vfio/migration.c | 8 ++++---- |
15 | 1 file changed, 4 insertions(+), 4 deletions(-) | 16 | 1 file changed, 4 insertions(+), 4 deletions(-) |
16 | 17 | ||
... | ... | ||
49 | - bytes_transferred = 0; | 50 | - bytes_transferred = 0; |
50 | + qatomic_set(&bytes_transferred, 0); | 51 | + qatomic_set(&bytes_transferred, 0); |
51 | } | 52 | } |
52 | 53 | ||
53 | /* | 54 | /* |
... | ... | ||
---|---|---|---|
16 | @@ -XXX,XX +XXX,XX @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration) | 16 | @@ -XXX,XX +XXX,XX @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration) |
17 | qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); | 17 | qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); |
18 | qemu_put_be64(f, data_size); | 18 | qemu_put_be64(f, data_size); |
19 | qemu_put_buffer(f, migration->data_buffer, data_size); | 19 | qemu_put_buffer(f, migration->data_buffer, data_size); |
20 | - qatomic_add(&bytes_transferred, data_size); | 20 | - qatomic_add(&bytes_transferred, data_size); |
21 | + vfio_add_bytes_transferred(data_size); | 21 | + vfio_mig_add_bytes_transferred(data_size); |
22 | 22 | ||
23 | trace_vfio_save_block(migration->vbasedev->name, data_size); | 23 | trace_vfio_save_block(migration->vbasedev->name, data_size); |
24 | 24 | ||
25 | @@ -XXX,XX +XXX,XX @@ void vfio_reset_bytes_transferred(void) | 25 | @@ -XXX,XX +XXX,XX @@ void vfio_reset_bytes_transferred(void) |
26 | qatomic_set(&bytes_transferred, 0); | 26 | qatomic_set(&bytes_transferred, 0); |
27 | } | 27 | } |
28 | 28 | ||
29 | +void vfio_add_bytes_transferred(unsigned long val) | 29 | +void vfio_mig_add_bytes_transferred(unsigned long val) |
30 | +{ | 30 | +{ |
31 | + qatomic_add(&bytes_transferred, val); | 31 | + qatomic_add(&bytes_transferred, val); |
32 | +} | 32 | +} |
33 | + | 33 | + |
34 | /* | 34 | /* |
... | ... | ||
40 | +++ b/include/hw/vfio/vfio-common.h | 40 | +++ b/include/hw/vfio/vfio-common.h |
41 | @@ -XXX,XX +XXX,XX @@ void vfio_unblock_multiple_devices_migration(void); | 41 | @@ -XXX,XX +XXX,XX @@ void vfio_unblock_multiple_devices_migration(void); |
42 | bool vfio_viommu_preset(VFIODevice *vbasedev); | 42 | bool vfio_viommu_preset(VFIODevice *vbasedev); |
43 | int64_t vfio_mig_bytes_transferred(void); | 43 | int64_t vfio_mig_bytes_transferred(void); |
44 | void vfio_reset_bytes_transferred(void); | 44 | void vfio_reset_bytes_transferred(void); |
45 | +void vfio_add_bytes_transferred(unsigned long val); | 45 | +void vfio_mig_add_bytes_transferred(unsigned long val); |
46 | bool vfio_device_state_is_running(VFIODevice *vbasedev); | 46 | bool vfio_device_state_is_running(VFIODevice *vbasedev); |
47 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); | 47 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); |
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | This way they can also be referenced in translation units | 3 | This way they can also be referenced in translation units |
4 | other than migration.c. | 4 | other than migration.c. |
5 | 5 | ||
6 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
7 | --- | 8 | --- |
8 | hw/vfio/migration.c | 17 ----------------- | 9 | hw/vfio/migration.c | 17 ----------------- |
9 | include/hw/vfio/vfio-common.h | 17 +++++++++++++++++ | 10 | include/hw/vfio/vfio-common.h | 17 +++++++++++++++++ |
10 | 2 files changed, 17 insertions(+), 17 deletions(-) | 11 | 2 files changed, 17 insertions(+), 17 deletions(-) |
... | ... | ||
63 | +#define VFIO_MIG_FLAG_DEV_INIT_DATA_SENT (0xffffffffef100005ULL) | 64 | +#define VFIO_MIG_FLAG_DEV_INIT_DATA_SENT (0xffffffffef100005ULL) |
64 | + | 65 | + |
65 | enum { | 66 | enum { |
66 | VFIO_DEVICE_TYPE_PCI = 0, | 67 | VFIO_DEVICE_TYPE_PCI = 0, |
67 | VFIO_DEVICE_TYPE_PLATFORM = 1, | 68 | VFIO_DEVICE_TYPE_PLATFORM = 1, |
... | ... | ||
---|---|---|---|
8 | migration code (migration.c) via migration-multifd.h header file. | 8 | migration code (migration.c) via migration-multifd.h header file. |
9 | 9 | ||
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
11 | --- | 11 | --- |
12 | hw/vfio/meson.build | 1 + | 12 | hw/vfio/meson.build | 1 + |
13 | hw/vfio/migration-multifd.c | 31 +++++++++++++++++++++++++++++++ | 13 | hw/vfio/migration-multifd.c | 33 +++++++++++++++++++++++++++++++++ |
14 | hw/vfio/migration-multifd.h | 15 +++++++++++++++ | 14 | hw/vfio/migration-multifd.h | 17 +++++++++++++++++ |
15 | hw/vfio/migration.c | 1 + | 15 | hw/vfio/migration.c | 1 + |
16 | 4 files changed, 48 insertions(+) | 16 | 4 files changed, 52 insertions(+) |
17 | create mode 100644 hw/vfio/migration-multifd.c | 17 | create mode 100644 hw/vfio/migration-multifd.c |
18 | create mode 100644 hw/vfio/migration-multifd.h | 18 | create mode 100644 hw/vfio/migration-multifd.h |
19 | 19 | ||
20 | diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build | 20 | diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build |
21 | index XXXXXXX..XXXXXXX 100644 | 21 | index XXXXXXX..XXXXXXX 100644 |
... | ... | ||
40 | + * | 40 | + * |
41 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. | 41 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. |
42 | + * | 42 | + * |
43 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. | 43 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. |
44 | + * See the COPYING file in the top-level directory. | 44 | + * See the COPYING file in the top-level directory. |
45 | + * | ||
46 | + * SPDX-License-Identifier: GPL-2.0-or-later | ||
45 | + */ | 47 | + */ |
46 | + | 48 | + |
47 | +#include "qemu/osdep.h" | 49 | +#include "qemu/osdep.h" |
48 | +#include "hw/vfio/vfio-common.h" | 50 | +#include "hw/vfio/vfio-common.h" |
49 | +#include "migration/misc.h" | 51 | +#include "migration/misc.h" |
... | ... | ||
77 | + * | 79 | + * |
78 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. | 80 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. |
79 | + * | 81 | + * |
80 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. | 82 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. |
81 | + * See the COPYING file in the top-level directory. | 83 | + * See the COPYING file in the top-level directory. |
84 | + * | ||
85 | + * SPDX-License-Identifier: GPL-2.0-or-later | ||
82 | + */ | 86 | + */ |
83 | + | 87 | + |
84 | +#ifndef HW_VFIO_MIGRATION_MULTIFD_H | 88 | +#ifndef HW_VFIO_MIGRATION_MULTIFD_H |
85 | +#define HW_VFIO_MIGRATION_MULTIFD_H | 89 | +#define HW_VFIO_MIGRATION_MULTIFD_H |
86 | + | 90 | + |
... | ... | ||
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | Add vfio_multifd_transfer_supported() function that tells whether the | 3 | Add vfio_multifd_transfer_supported() function that tells whether the |
4 | multifd device state transfer is supported. | 4 | multifd device state transfer is supported. |
5 | 5 | ||
6 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
7 | --- | 8 | --- |
8 | hw/vfio/migration-multifd.c | 6 ++++++ | 9 | hw/vfio/migration-multifd.c | 6 ++++++ |
9 | hw/vfio/migration-multifd.h | 2 ++ | 10 | hw/vfio/migration-multifd.h | 2 ++ |
10 | 2 files changed, 8 insertions(+) | 11 | 2 files changed, 8 insertions(+) |
11 | 12 | ||
12 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 13 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
13 | index XXXXXXX..XXXXXXX 100644 | 14 | index XXXXXXX..XXXXXXX 100644 |
14 | --- a/hw/vfio/migration-multifd.c | 15 | --- a/hw/vfio/migration-multifd.c |
15 | +++ b/hw/vfio/migration-multifd.c | 16 | +++ b/hw/vfio/migration-multifd.c |
16 | @@ -XXX,XX +XXX,XX @@ static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, guint idx) | 17 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODeviceStatePacket { |
17 | { | 18 | uint32_t flags; |
18 | return &g_array_index(bufs->array, VFIOStateBuffer, idx); | 19 | uint8_t data[0]; |
19 | } | 20 | } QEMU_PACKED VFIODeviceStatePacket; |
20 | + | 21 | + |
21 | +bool vfio_multifd_transfer_supported(void) | 22 | +bool vfio_multifd_transfer_supported(void) |
22 | +{ | 23 | +{ |
23 | + return multifd_device_state_supported() && | 24 | + return multifd_device_state_supported() && |
24 | + migrate_send_switchover_start(); | 25 | + migrate_send_switchover_start(); |
... | ... | ||
32 | #include "hw/vfio/vfio-common.h" | 33 | #include "hw/vfio/vfio-common.h" |
33 | 34 | ||
34 | +bool vfio_multifd_transfer_supported(void); | 35 | +bool vfio_multifd_transfer_supported(void); |
35 | + | 36 | + |
36 | #endif | 37 | #endif |
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | Add support for VFIOMultifd data structure that will contain most of the | 3 | Add multifd setup/cleanup functions and an associated VFIOMultifd data |
4 | receive-side data together with its init/cleanup methods. | 4 | structure that will contain most of the receive-side data together |
5 | with its init/cleanup methods. | ||
5 | 6 | ||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
7 | --- | 8 | --- |
8 | hw/vfio/migration-multifd.c | 33 +++++++++++++++++++++++++++++++++ | 9 | hw/vfio/migration-multifd.c | 44 +++++++++++++++++++++++++++++++++++ |
9 | hw/vfio/migration-multifd.h | 8 ++++++++ | 10 | hw/vfio/migration-multifd.h | 4 ++++ |
10 | hw/vfio/migration.c | 29 +++++++++++++++++++++++++++-- | ||
11 | include/hw/vfio/vfio-common.h | 3 +++ | 11 | include/hw/vfio/vfio-common.h | 3 +++ |
12 | 4 files changed, 71 insertions(+), 2 deletions(-) | 12 | 3 files changed, 51 insertions(+) |
13 | 13 | ||
14 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 14 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
15 | index XXXXXXX..XXXXXXX 100644 | 15 | index XXXXXXX..XXXXXXX 100644 |
16 | --- a/hw/vfio/migration-multifd.c | 16 | --- a/hw/vfio/migration-multifd.c |
17 | +++ b/hw/vfio/migration-multifd.c | 17 | +++ b/hw/vfio/migration-multifd.c |
18 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOStateBuffer { | 18 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODeviceStatePacket { |
19 | size_t len; | 19 | uint8_t data[0]; |
20 | } VFIOStateBuffer; | 20 | } QEMU_PACKED VFIODeviceStatePacket; |
21 | 21 | ||
22 | +typedef struct VFIOMultifd { | 22 | +typedef struct VFIOMultifd { |
23 | +} VFIOMultifd; | 23 | +} VFIOMultifd; |
24 | + | 24 | + |
25 | static void vfio_state_buffer_clear(gpointer data) | 25 | +static VFIOMultifd *vfio_multifd_new(void) |
26 | { | ||
27 | VFIOStateBuffer *lb = data; | ||
28 | @@ -XXX,XX +XXX,XX @@ static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, guint idx) | ||
29 | return &g_array_index(bufs->array, VFIOStateBuffer, idx); | ||
30 | } | ||
31 | |||
32 | +VFIOMultifd *vfio_multifd_new(void) | ||
33 | +{ | 26 | +{ |
34 | + VFIOMultifd *multifd = g_new(VFIOMultifd, 1); | 27 | + VFIOMultifd *multifd = g_new(VFIOMultifd, 1); |
35 | + | 28 | + |
36 | + return multifd; | 29 | + return multifd; |
37 | +} | 30 | +} |
38 | + | 31 | + |
39 | +void vfio_multifd_free(VFIOMultifd *multifd) | 32 | +static void vfio_multifd_free(VFIOMultifd *multifd) |
40 | +{ | 33 | +{ |
41 | + g_free(multifd); | 34 | + g_free(multifd); |
35 | +} | ||
36 | + | ||
37 | +void vfio_multifd_cleanup(VFIODevice *vbasedev) | ||
38 | +{ | ||
39 | + VFIOMigration *migration = vbasedev->migration; | ||
40 | + | ||
41 | + g_clear_pointer(&migration->multifd, vfio_multifd_free); | ||
42 | +} | 42 | +} |
43 | + | 43 | + |
44 | bool vfio_multifd_transfer_supported(void) | 44 | bool vfio_multifd_transfer_supported(void) |
45 | { | 45 | { |
46 | return multifd_device_state_supported() && | 46 | return multifd_device_state_supported() && |
... | ... | ||
50 | +bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev) | 50 | +bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev) |
51 | +{ | 51 | +{ |
52 | + return false; | 52 | + return false; |
53 | +} | 53 | +} |
54 | + | 54 | + |
55 | +bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp) | 55 | +bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) |
56 | +{ | 56 | +{ |
57 | + if (vfio_multifd_transfer_enabled(vbasedev) && | 57 | + VFIOMigration *migration = vbasedev->migration; |
58 | + !vfio_multifd_transfer_supported()) { | 58 | + |
59 | + error_setg(errp, | 59 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { |
60 | + "%s: Multifd device transfer requested but unsupported in the current config", | 60 | + /* Nothing further to check or do */ |
61 | + vbasedev->name); | 61 | + return true; |
62 | + return false; | 62 | + } |
63 | + | ||
64 | + if (alloc_multifd) { | ||
65 | + assert(!migration->multifd); | ||
66 | + migration->multifd = vfio_multifd_new(); | ||
63 | + } | 67 | + } |
64 | + | 68 | + |
65 | + return true; | 69 | + return true; |
66 | +} | 70 | +} |
67 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | 71 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h |
... | ... | ||
70 | +++ b/hw/vfio/migration-multifd.h | 74 | +++ b/hw/vfio/migration-multifd.h |
71 | @@ -XXX,XX +XXX,XX @@ | 75 | @@ -XXX,XX +XXX,XX @@ |
72 | 76 | ||
73 | #include "hw/vfio/vfio-common.h" | 77 | #include "hw/vfio/vfio-common.h" |
74 | 78 | ||
75 | +typedef struct VFIOMultifd VFIOMultifd; | 79 | +bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp); |
76 | + | 80 | +void vfio_multifd_cleanup(VFIODevice *vbasedev); |
77 | +VFIOMultifd *vfio_multifd_new(void); | ||
78 | +void vfio_multifd_free(VFIOMultifd *multifd); | ||
79 | + | 81 | + |
80 | bool vfio_multifd_transfer_supported(void); | 82 | bool vfio_multifd_transfer_supported(void); |
81 | +bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); | 83 | +bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); |
82 | + | ||
83 | +bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp); | ||
84 | 84 | ||
85 | #endif | 85 | #endif |
86 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
87 | index XXXXXXX..XXXXXXX 100644 | ||
88 | --- a/hw/vfio/migration.c | ||
89 | +++ b/hw/vfio/migration.c | ||
90 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_state(QEMUFile *f, void *opaque) | ||
91 | static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) | ||
92 | { | ||
93 | VFIODevice *vbasedev = opaque; | ||
94 | + VFIOMigration *migration = vbasedev->migration; | ||
95 | + int ret; | ||
96 | + | ||
97 | + if (!vfio_multifd_transfer_setup(vbasedev, errp)) { | ||
98 | + return -EINVAL; | ||
99 | + } | ||
100 | + | ||
101 | + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, | ||
102 | + migration->device_state, errp); | ||
103 | + if (ret) { | ||
104 | + return ret; | ||
105 | + } | ||
106 | |||
107 | - return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, | ||
108 | - vbasedev->migration->device_state, errp); | ||
109 | + if (vfio_multifd_transfer_enabled(vbasedev)) { | ||
110 | + assert(!migration->multifd); | ||
111 | + migration->multifd = vfio_multifd_new(); | ||
112 | + } | ||
113 | + | ||
114 | + return 0; | ||
115 | +} | ||
116 | + | ||
117 | +static void vfio_multifd_cleanup(VFIODevice *vbasedev) | ||
118 | +{ | ||
119 | + VFIOMigration *migration = vbasedev->migration; | ||
120 | + | ||
121 | + g_clear_pointer(&migration->multifd, vfio_multifd_free); | ||
122 | } | ||
123 | |||
124 | static int vfio_load_cleanup(void *opaque) | ||
125 | { | ||
126 | VFIODevice *vbasedev = opaque; | ||
127 | |||
128 | + vfio_multifd_cleanup(vbasedev); | ||
129 | + | ||
130 | vfio_migration_cleanup(vbasedev); | ||
131 | trace_vfio_load_cleanup(vbasedev->name); | ||
132 | |||
133 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | 86 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h |
134 | index XXXXXXX..XXXXXXX 100644 | 87 | index XXXXXXX..XXXXXXX 100644 |
135 | --- a/include/hw/vfio/vfio-common.h | 88 | --- a/include/hw/vfio/vfio-common.h |
136 | +++ b/include/hw/vfio/vfio-common.h | 89 | +++ b/include/hw/vfio/vfio-common.h |
137 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIORegion { | 90 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIORegion { |
... | ... | ||
New patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
1 | 2 | ||
3 | Wire VFIO multifd transfer specific setup and cleanup functions into | ||
4 | general VFIO load/save setup and cleanup methods. | ||
5 | |||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
7 | --- | ||
8 | hw/vfio/migration.c | 24 ++++++++++++++++++++++-- | ||
9 | 1 file changed, 22 insertions(+), 2 deletions(-) | ||
10 | |||
11 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
12 | index XXXXXXX..XXXXXXX 100644 | ||
13 | --- a/hw/vfio/migration.c | ||
14 | +++ b/hw/vfio/migration.c | ||
15 | @@ -XXX,XX +XXX,XX @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp) | ||
16 | uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE; | ||
17 | int ret; | ||
18 | |||
19 | + if (!vfio_multifd_setup(vbasedev, false, errp)) { | ||
20 | + return -EINVAL; | ||
21 | + } | ||
22 | + | ||
23 | qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE); | ||
24 | |||
25 | vfio_query_stop_copy_size(vbasedev, &stop_copy_size); | ||
26 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_cleanup(void *opaque) | ||
27 | Error *local_err = NULL; | ||
28 | int ret; | ||
29 | |||
30 | + /* Currently a NOP, done for symmetry with load_cleanup() */ | ||
31 | + vfio_multifd_cleanup(vbasedev); | ||
32 | + | ||
33 | /* | ||
34 | * Changing device state from STOP_COPY to STOP can take time. Do it here, | ||
35 | * after migration has completed, so it won't increase downtime. | ||
36 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_state(QEMUFile *f, void *opaque) | ||
37 | static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) | ||
38 | { | ||
39 | VFIODevice *vbasedev = opaque; | ||
40 | + VFIOMigration *migration = vbasedev->migration; | ||
41 | + int ret; | ||
42 | |||
43 | - return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, | ||
44 | - vbasedev->migration->device_state, errp); | ||
45 | + if (!vfio_multifd_setup(vbasedev, true, errp)) { | ||
46 | + return -EINVAL; | ||
47 | + } | ||
48 | + | ||
49 | + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, | ||
50 | + migration->device_state, errp); | ||
51 | + if (ret) { | ||
52 | + return ret; | ||
53 | + } | ||
54 | + | ||
55 | + return 0; | ||
56 | } | ||
57 | |||
58 | static int vfio_load_cleanup(void *opaque) | ||
59 | { | ||
60 | VFIODevice *vbasedev = opaque; | ||
61 | |||
62 | + vfio_multifd_cleanup(vbasedev); | ||
63 | + | ||
64 | vfio_migration_cleanup(vbasedev); | ||
65 | trace_vfio_load_cleanup(vbasedev->name); |
... | ... | ||
---|---|---|---|
11 | The last such VFIO device state packet should have | 11 | The last such VFIO device state packet should have |
12 | VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state. | 12 | VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state. |
13 | 13 | ||
14 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 14 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
15 | --- | 15 | --- |
16 | hw/vfio/migration-multifd.c | 103 ++++++++++++++++++++++++++++++++++++ | 16 | hw/vfio/migration-multifd.c | 163 ++++++++++++++++++++++++++++++++++++ |
17 | hw/vfio/migration-multifd.h | 3 ++ | 17 | hw/vfio/migration-multifd.h | 3 + |
18 | hw/vfio/migration.c | 1 + | 18 | hw/vfio/migration.c | 1 + |
19 | hw/vfio/trace-events | 1 + | 19 | hw/vfio/trace-events | 1 + |
20 | 4 files changed, 108 insertions(+) | 20 | 4 files changed, 168 insertions(+) |
21 | 21 | ||
22 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 22 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
23 | index XXXXXXX..XXXXXXX 100644 | 23 | index XXXXXXX..XXXXXXX 100644 |
24 | --- a/hw/vfio/migration-multifd.c | 24 | --- a/hw/vfio/migration-multifd.c |
25 | +++ b/hw/vfio/migration-multifd.c | 25 | +++ b/hw/vfio/migration-multifd.c |
26 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOStateBuffer { | 26 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODeviceStatePacket { |
27 | } VFIOStateBuffer; | 27 | uint8_t data[0]; |
28 | 28 | } QEMU_PACKED VFIODeviceStatePacket; | |
29 | |||
30 | +/* type safety */ | ||
31 | +typedef struct VFIOStateBuffers { | ||
32 | + GArray *array; | ||
33 | +} VFIOStateBuffers; | ||
34 | + | ||
35 | +typedef struct VFIOStateBuffer { | ||
36 | + bool is_present; | ||
37 | + char *data; | ||
38 | + size_t len; | ||
39 | +} VFIOStateBuffer; | ||
40 | + | ||
29 | typedef struct VFIOMultifd { | 41 | typedef struct VFIOMultifd { |
30 | + VFIOStateBuffers load_bufs; | 42 | + VFIOStateBuffers load_bufs; |
31 | + QemuCond load_bufs_buffer_ready_cond; | 43 | + QemuCond load_bufs_buffer_ready_cond; |
32 | + QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ | 44 | + QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ |
33 | + uint32_t load_buf_idx; | 45 | + uint32_t load_buf_idx; |
34 | + uint32_t load_buf_idx_last; | 46 | + uint32_t load_buf_idx_last; |
35 | } VFIOMultifd; | 47 | } VFIOMultifd; |
36 | 48 | ||
37 | static void vfio_state_buffer_clear(gpointer data) | 49 | +static void vfio_state_buffer_clear(gpointer data) |
38 | @@ -XXX,XX +XXX,XX @@ static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, guint idx) | 50 | +{ |
39 | return &g_array_index(bufs->array, VFIOStateBuffer, idx); | 51 | + VFIOStateBuffer *lb = data; |
40 | } | 52 | + |
41 | 53 | + if (!lb->is_present) { | |
54 | + return; | ||
55 | + } | ||
56 | + | ||
57 | + g_clear_pointer(&lb->data, g_free); | ||
58 | + lb->is_present = false; | ||
59 | +} | ||
60 | + | ||
61 | +static void vfio_state_buffers_init(VFIOStateBuffers *bufs) | ||
62 | +{ | ||
63 | + bufs->array = g_array_new(FALSE, TRUE, sizeof(VFIOStateBuffer)); | ||
64 | + g_array_set_clear_func(bufs->array, vfio_state_buffer_clear); | ||
65 | +} | ||
66 | + | ||
67 | +static void vfio_state_buffers_destroy(VFIOStateBuffers *bufs) | ||
68 | +{ | ||
69 | + g_clear_pointer(&bufs->array, g_array_unref); | ||
70 | +} | ||
71 | + | ||
72 | +static void vfio_state_buffers_assert_init(VFIOStateBuffers *bufs) | ||
73 | +{ | ||
74 | + assert(bufs->array); | ||
75 | +} | ||
76 | + | ||
77 | +static unsigned int vfio_state_buffers_size_get(VFIOStateBuffers *bufs) | ||
78 | +{ | ||
79 | + return bufs->array->len; | ||
80 | +} | ||
81 | + | ||
82 | +static void vfio_state_buffers_size_set(VFIOStateBuffers *bufs, | ||
83 | + unsigned int size) | ||
84 | +{ | ||
85 | + g_array_set_size(bufs->array, size); | ||
86 | +} | ||
87 | + | ||
88 | +static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, | ||
89 | + unsigned int idx) | ||
90 | +{ | ||
91 | + return &g_array_index(bufs->array, VFIOStateBuffer, idx); | ||
92 | +} | ||
93 | + | ||
94 | +/* called with load_bufs_mutex locked */ | ||
42 | +static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev, | 95 | +static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev, |
43 | + VFIODeviceStatePacket *packet, | 96 | + VFIODeviceStatePacket *packet, |
44 | + size_t packet_total_size, | 97 | + size_t packet_total_size, |
45 | + Error **errp) | 98 | + Error **errp) |
46 | +{ | 99 | +{ |
... | ... | ||
53 | + vfio_state_buffers_size_set(&multifd->load_bufs, packet->idx + 1); | 106 | + vfio_state_buffers_size_set(&multifd->load_bufs, packet->idx + 1); |
54 | + } | 107 | + } |
55 | + | 108 | + |
56 | + lb = vfio_state_buffers_at(&multifd->load_bufs, packet->idx); | 109 | + lb = vfio_state_buffers_at(&multifd->load_bufs, packet->idx); |
57 | + if (lb->is_present) { | 110 | + if (lb->is_present) { |
58 | + error_setg(errp, "state buffer %" PRIu32 " already filled", | 111 | + error_setg(errp, "%s: state buffer %" PRIu32 " already filled", |
59 | + packet->idx); | 112 | + vbasedev->name, packet->idx); |
60 | + return false; | 113 | + return false; |
61 | + } | 114 | + } |
62 | + | 115 | + |
63 | + assert(packet->idx >= multifd->load_buf_idx); | 116 | + assert(packet->idx >= multifd->load_buf_idx); |
64 | + | 117 | + |
... | ... | ||
67 | + lb->is_present = true; | 120 | + lb->is_present = true; |
68 | + | 121 | + |
69 | + return true; | 122 | + return true; |
70 | +} | 123 | +} |
71 | + | 124 | + |
72 | +bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, | 125 | +bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, |
73 | + Error **errp) | 126 | + Error **errp) |
74 | +{ | 127 | +{ |
75 | + VFIODevice *vbasedev = opaque; | 128 | + VFIODevice *vbasedev = opaque; |
76 | + VFIOMigration *migration = vbasedev->migration; | 129 | + VFIOMigration *migration = vbasedev->migration; |
77 | + VFIOMultifd *multifd = migration->multifd; | 130 | + VFIOMultifd *multifd = migration->multifd; |
78 | + VFIODeviceStatePacket *packet = (VFIODeviceStatePacket *)data; | 131 | + VFIODeviceStatePacket *packet = (VFIODeviceStatePacket *)data; |
132 | + | ||
133 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { | ||
134 | + error_setg(errp, | ||
135 | + "%s: got device state packet but not doing multifd transfer", | ||
136 | + vbasedev->name); | ||
137 | + return false; | ||
138 | + } | ||
139 | + | ||
140 | + assert(multifd); | ||
141 | + | ||
142 | + if (data_size < sizeof(*packet)) { | ||
143 | + error_setg(errp, "%s: packet too short at %zu (min is %zu)", | ||
144 | + vbasedev->name, data_size, sizeof(*packet)); | ||
145 | + return false; | ||
146 | + } | ||
147 | + | ||
148 | + if (packet->version != VFIO_DEVICE_STATE_PACKET_VER_CURRENT) { | ||
149 | + error_setg(errp, "%s: packet has unknown version %" PRIu32, | ||
150 | + vbasedev->name, packet->version); | ||
151 | + return false; | ||
152 | + } | ||
153 | + | ||
154 | + if (packet->idx == UINT32_MAX) { | ||
155 | + error_setg(errp, "%s: packet index is invalid", vbasedev->name); | ||
156 | + return false; | ||
157 | + } | ||
158 | + | ||
159 | + trace_vfio_load_state_device_buffer_incoming(vbasedev->name, packet->idx); | ||
79 | + | 160 | + |
80 | + /* | 161 | + /* |
81 | + * Holding BQL here would violate the lock order and can cause | 162 | + * Holding BQL here would violate the lock order and can cause |
82 | + * a deadlock once we attempt to lock load_bufs_mutex below. | 163 | + * a deadlock once we attempt to lock load_bufs_mutex below. |
83 | + */ | 164 | + */ |
84 | + assert(!bql_locked()); | 165 | + assert(!bql_locked()); |
85 | + | 166 | + |
86 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { | 167 | + WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { |
87 | + error_setg(errp, | 168 | + /* config state packet should be the last one in the stream */ |
88 | + "got device state packet but not doing multifd transfer"); | 169 | + if (packet->flags & VFIO_DEVICE_STATE_CONFIG_STATE) { |
89 | + return false; | 170 | + multifd->load_buf_idx_last = packet->idx; |
90 | + } | 171 | + } |
91 | + | 172 | + |
92 | + assert(multifd); | 173 | + if (!vfio_load_state_buffer_insert(vbasedev, packet, data_size, |
93 | + | 174 | + errp)) { |
94 | + if (data_size < sizeof(*packet)) { | 175 | + return false; |
95 | + error_setg(errp, "packet too short at %zu (min is %zu)", | 176 | + } |
96 | + data_size, sizeof(*packet)); | 177 | + |
97 | + return false; | 178 | + qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); |
98 | + } | 179 | + } |
99 | + | ||
100 | + if (packet->version != VFIO_DEVICE_STATE_PACKET_VER_CURRENT) { | ||
101 | + error_setg(errp, "packet has unknown version %" PRIu32, | ||
102 | + packet->version); | ||
103 | + return false; | ||
104 | + } | ||
105 | + | ||
106 | + if (packet->idx == UINT32_MAX) { | ||
107 | + error_setg(errp, "packet has too high idx"); | ||
108 | + return false; | ||
109 | + } | ||
110 | + | ||
111 | + trace_vfio_load_state_device_buffer_incoming(vbasedev->name, packet->idx); | ||
112 | + | ||
113 | + QEMU_LOCK_GUARD(&multifd->load_bufs_mutex); | ||
114 | + | ||
115 | + /* config state packet should be the last one in the stream */ | ||
116 | + if (packet->flags & VFIO_DEVICE_STATE_CONFIG_STATE) { | ||
117 | + multifd->load_buf_idx_last = packet->idx; | ||
118 | + } | ||
119 | + | ||
120 | + if (!vfio_load_state_buffer_insert(vbasedev, packet, data_size, errp)) { | ||
121 | + return false; | ||
122 | + } | ||
123 | + | ||
124 | + qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); | ||
125 | + | 180 | + |
126 | + return true; | 181 | + return true; |
127 | +} | 182 | +} |
128 | + | 183 | + |
129 | VFIOMultifd *vfio_multifd_new(void) | 184 | static VFIOMultifd *vfio_multifd_new(void) |
130 | { | 185 | { |
131 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); | 186 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); |
132 | 187 | ||
133 | + vfio_state_buffers_init(&multifd->load_bufs); | 188 | + vfio_state_buffers_init(&multifd->load_bufs); |
134 | + | 189 | + |
... | ... | ||
139 | + qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); | 194 | + qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); |
140 | + | 195 | + |
141 | return multifd; | 196 | return multifd; |
142 | } | 197 | } |
143 | 198 | ||
144 | void vfio_multifd_free(VFIOMultifd *multifd) | 199 | static void vfio_multifd_free(VFIOMultifd *multifd) |
145 | { | 200 | { |
201 | + vfio_state_buffers_destroy(&multifd->load_bufs); | ||
146 | + qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); | 202 | + qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); |
147 | + qemu_mutex_destroy(&multifd->load_bufs_mutex); | 203 | + qemu_mutex_destroy(&multifd->load_bufs_mutex); |
148 | + | 204 | + |
149 | g_free(multifd); | 205 | g_free(multifd); |
150 | } | 206 | } |
151 | 207 | ||
152 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | 208 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h |
153 | index XXXXXXX..XXXXXXX 100644 | 209 | index XXXXXXX..XXXXXXX 100644 |
154 | --- a/hw/vfio/migration-multifd.h | 210 | --- a/hw/vfio/migration-multifd.h |
155 | +++ b/hw/vfio/migration-multifd.h | 211 | +++ b/hw/vfio/migration-multifd.h |
156 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); | 212 | @@ -XXX,XX +XXX,XX @@ void vfio_multifd_cleanup(VFIODevice *vbasedev); |
157 | 213 | bool vfio_multifd_transfer_supported(void); | |
158 | bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp); | 214 | bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); |
159 | 215 | ||
160 | +bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, | 216 | +bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, |
161 | + Error **errp); | 217 | + Error **errp); |
162 | + | 218 | + |
163 | #endif | 219 | #endif |
164 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | 220 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c |
165 | index XXXXXXX..XXXXXXX 100644 | 221 | index XXXXXXX..XXXXXXX 100644 |
166 | --- a/hw/vfio/migration.c | 222 | --- a/hw/vfio/migration.c |
167 | +++ b/hw/vfio/migration.c | 223 | +++ b/hw/vfio/migration.c |
168 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { | 224 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { |
169 | .load_setup = vfio_load_setup, | 225 | .load_setup = vfio_load_setup, |
170 | .load_cleanup = vfio_load_cleanup, | 226 | .load_cleanup = vfio_load_cleanup, |
171 | .load_state = vfio_load_state, | 227 | .load_state = vfio_load_state, |
172 | + .load_state_buffer = vfio_load_state_buffer, | 228 | + .load_state_buffer = vfio_multifd_load_state_buffer, |
173 | .switchover_ack_needed = vfio_switchover_ack_needed, | 229 | .switchover_ack_needed = vfio_switchover_ack_needed, |
174 | }; | 230 | }; |
175 | 231 | ||
176 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | 232 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events |
177 | index XXXXXXX..XXXXXXX 100644 | 233 | index XXXXXXX..XXXXXXX 100644 |
... | ... | diff view generated by jsdifflib |
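The incoming-packet checks above (minimum length, known version, reserved index) are easy to factor out; the following is a stand-alone sketch where packet_header and PACKET_VER_CURRENT are simplified stand-ins for the actual VFIODeviceStatePacket layout, not the real wire format:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduction of the device state packet header. */
typedef struct {
    uint32_t version;
    uint32_t flags;
    uint32_t idx;
} packet_header;

#define PACKET_VER_CURRENT 1

/*
 * Returns true when a received buffer passes the same three checks the
 * patch performs before queuing a buffer: it is long enough to hold the
 * header, the protocol version is known, and the index is usable
 * (UINT32_MAX is reserved as the "config state not seen yet" sentinel).
 */
static bool packet_header_valid(const char *data, size_t data_size)
{
    packet_header hdr;

    if (data_size < sizeof(hdr)) {
        return false; /* too short to even hold the header */
    }
    memcpy(&hdr, data, sizeof(hdr));
    if (hdr.version != PACKET_VER_CURRENT) {
        return false; /* unknown protocol revision */
    }
    if (hdr.idx == UINT32_MAX) {
        return false; /* reserved index value */
    }
    return true;
}
```

Validating before taking load_bufs_mutex, as the v6 code does, keeps malformed packets from ever touching the shared buffer array.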
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | |||
3 | Add a thread which loads the VFIO device state buffers that were received | ||
4 | via multifd. | ||
5 | |||
6 | Each VFIO device that has multifd device state transfer enabled has one | ||
7 | such thread, which is created using migration core API | ||
8 | qemu_loadvm_start_load_thread(). | ||
2 | 9 | ||
3 | Since it's important to finish loading device state transferred via the | 10 | Since it's important to finish loading device state transferred via the |
4 | main migration channel (via save_live_iterate SaveVMHandler) before | 11 | main migration channel (via save_live_iterate SaveVMHandler) before |
5 | starting to load the data asynchronously transferred via multifd, the thread | 12 | starting to load the data asynchronously transferred via multifd, the thread
6 | doing the actual loading of the multifd transferred data is only started | 13 | doing the actual loading of the multifd transferred data is only started |
... | ... | ||
16 | MIG_CMD_SWITCHOVER_START) so by the time MIG_CMD_SWITCHOVER_START is | 23 | MIG_CMD_SWITCHOVER_START) so by the time MIG_CMD_SWITCHOVER_START is |
17 | processed all the preceding data must have already been loaded. | 24 | processed all the preceding data must have already been loaded.
18 | 25 | ||
19 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 26 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
20 | --- | 27 | --- |
21 | hw/vfio/migration-multifd.c | 225 ++++++++++++++++++++++++++++++++++++ | 28 | hw/vfio/migration-multifd.c | 226 ++++++++++++++++++++++++++++++++++++ |
22 | hw/vfio/migration-multifd.h | 2 + | 29 | hw/vfio/migration-multifd.h | 2 + |
23 | hw/vfio/migration.c | 12 ++ | 30 | hw/vfio/migration.c | 12 ++ |
24 | hw/vfio/trace-events | 5 + | 31 | hw/vfio/trace-events | 7 ++ |
25 | 4 files changed, 244 insertions(+) | 32 | 4 files changed, 247 insertions(+) |
26 | 33 | ||
27 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 34 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
28 | index XXXXXXX..XXXXXXX 100644 | 35 | index XXXXXXX..XXXXXXX 100644 |
29 | --- a/hw/vfio/migration-multifd.c | 36 | --- a/hw/vfio/migration-multifd.c |
30 | +++ b/hw/vfio/migration-multifd.c | 37 | +++ b/hw/vfio/migration-multifd.c |
31 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOStateBuffer { | 38 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOStateBuffer { |
32 | } VFIOStateBuffer; | 39 | } VFIOStateBuffer; |
33 | 40 | ||
34 | typedef struct VFIOMultifd { | 41 | typedef struct VFIOMultifd { |
35 | + QemuThread load_bufs_thread; | ||
36 | + bool load_bufs_thread_running; | 42 | + bool load_bufs_thread_running; |
37 | + bool load_bufs_thread_want_exit; | 43 | + bool load_bufs_thread_want_exit; |
38 | + | 44 | + |
39 | VFIOStateBuffers load_bufs; | 45 | VFIOStateBuffers load_bufs; |
40 | QemuCond load_bufs_buffer_ready_cond; | 46 | QemuCond load_bufs_buffer_ready_cond; |
41 | + QemuCond load_bufs_thread_finished_cond; | 47 | + QemuCond load_bufs_thread_finished_cond; |
42 | QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ | 48 | QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ |
43 | uint32_t load_buf_idx; | 49 | uint32_t load_buf_idx; |
44 | uint32_t load_buf_idx_last; | 50 | uint32_t load_buf_idx_last; |
45 | @@ -XXX,XX +XXX,XX @@ bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, | 51 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, |
46 | return true; | 52 | return true; |
47 | } | 53 | } |
48 | 54 | ||
49 | +static int vfio_load_bufs_thread_load_config(VFIODevice *vbasedev) | 55 | +static bool vfio_load_bufs_thread_load_config(VFIODevice *vbasedev, |
50 | +{ | 56 | + Error **errp) |
51 | + return -EINVAL; | 57 | +{ |
58 | + error_setg(errp, "not yet there"); | ||
59 | + return false; | ||
52 | +} | 60 | +} |
53 | + | 61 | + |
54 | +static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) | 62 | +static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) |
55 | +{ | 63 | +{ |
56 | + VFIOStateBuffer *lb; | 64 | + VFIOStateBuffer *lb; |
57 | + guint bufs_len; | 65 | + unsigned int bufs_len; |
58 | + | 66 | + |
59 | + bufs_len = vfio_state_buffers_size_get(&multifd->load_bufs); | 67 | + bufs_len = vfio_state_buffers_size_get(&multifd->load_bufs); |
60 | + if (multifd->load_buf_idx >= bufs_len) { | 68 | + if (multifd->load_buf_idx >= bufs_len) { |
61 | + assert(multifd->load_buf_idx == bufs_len); | 69 | + assert(multifd->load_buf_idx == bufs_len); |
62 | + return NULL; | 70 | + return NULL; |
... | ... | ||
105 | + errno_save = errno; | 113 | + errno_save = errno; |
106 | + qemu_mutex_lock(&multifd->load_bufs_mutex); | 114 | + qemu_mutex_lock(&multifd->load_bufs_mutex); |
107 | + | 115 | + |
108 | + if (wr_ret < 0) { | 116 | + if (wr_ret < 0) { |
109 | + error_setg(errp, | 117 | + error_setg(errp, |
110 | + "writing state buffer %" PRIu32 " failed: %d", | 118 | + "%s: writing state buffer %" PRIu32 " failed: %d", |
111 | + multifd->load_buf_idx, errno_save); | 119 | + vbasedev->name, multifd->load_buf_idx, errno_save); |
112 | + return false; | 120 | + return false; |
113 | + } | 121 | + } |
114 | + | 122 | + |
115 | + assert(wr_ret <= buf_len); | 123 | + assert(wr_ret <= buf_len); |
116 | + buf_len -= wr_ret; | 124 | + buf_len -= wr_ret; |
... | ... | ||
144 | +static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp) | 152 | +static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp) |
145 | +{ | 153 | +{ |
146 | + VFIODevice *vbasedev = opaque; | 154 | + VFIODevice *vbasedev = opaque; |
147 | + VFIOMigration *migration = vbasedev->migration; | 155 | + VFIOMigration *migration = vbasedev->migration; |
148 | + VFIOMultifd *multifd = migration->multifd; | 156 | + VFIOMultifd *multifd = migration->multifd; |
149 | + bool ret = true; | 157 | + bool ret = false; |
150 | + int config_ret; | 158 | + |
159 | + trace_vfio_load_bufs_thread_start(vbasedev->name); | ||
151 | + | 160 | + |
152 | + assert(multifd); | 161 | + assert(multifd); |
153 | + QEMU_LOCK_GUARD(&multifd->load_bufs_mutex); | 162 | + QEMU_LOCK_GUARD(&multifd->load_bufs_mutex); |
154 | + | 163 | + |
155 | + assert(multifd->load_bufs_thread_running); | 164 | + assert(multifd->load_bufs_thread_running); |
... | ... | ||
161 | + * Always check cancellation first after the buffer_ready wait below in | 170 | + * Always check cancellation first after the buffer_ready wait below in |
162 | + * case that cond was signalled by vfio_load_cleanup_load_bufs_thread(). | 171 | + * case that cond was signalled by vfio_load_cleanup_load_bufs_thread(). |
163 | + */ | 172 | + */ |
164 | + if (vfio_load_bufs_thread_want_exit(multifd, should_quit)) { | 173 | + if (vfio_load_bufs_thread_want_exit(multifd, should_quit)) { |
165 | + error_setg(errp, "operation cancelled"); | 174 | + error_setg(errp, "operation cancelled"); |
166 | + ret = false; | 175 | + goto thread_exit; |
167 | + goto ret_signal; | ||
168 | + } | 176 | + } |
169 | + | 177 | + |
170 | + assert(multifd->load_buf_idx <= multifd->load_buf_idx_last); | 178 | + assert(multifd->load_buf_idx <= multifd->load_buf_idx_last); |
171 | + | 179 | + |
172 | + lb = vfio_load_state_buffer_get(multifd); | 180 | + lb = vfio_load_state_buffer_get(multifd); |
... | ... | ||
185 | + if (multifd->load_buf_idx == 0) { | 193 | + if (multifd->load_buf_idx == 0) { |
186 | + trace_vfio_load_state_device_buffer_start(vbasedev->name); | 194 | + trace_vfio_load_state_device_buffer_start(vbasedev->name); |
187 | + } | 195 | + } |
188 | + | 196 | + |
189 | + if (!vfio_load_state_buffer_write(vbasedev, lb, errp)) { | 197 | + if (!vfio_load_state_buffer_write(vbasedev, lb, errp)) { |
190 | + ret = false; | 198 | + goto thread_exit; |
191 | + goto ret_signal; | ||
192 | + } | 199 | + } |
193 | + | 200 | + |
194 | + if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) { | 201 | + if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) { |
195 | + trace_vfio_load_state_device_buffer_end(vbasedev->name); | 202 | + trace_vfio_load_state_device_buffer_end(vbasedev->name); |
196 | + } | 203 | + } |
197 | + | 204 | + |
198 | + multifd->load_buf_idx++; | 205 | + multifd->load_buf_idx++; |
199 | + } | 206 | + } |
200 | + | 207 | + |
201 | + config_ret = vfio_load_bufs_thread_load_config(vbasedev); | 208 | + if (!vfio_load_bufs_thread_load_config(vbasedev, errp)) { |
202 | + if (config_ret) { | 209 | + goto thread_exit; |
203 | + error_setg(errp, "load config state failed: %d", config_ret); | 210 | + } |
204 | + ret = false; | 211 | + |
205 | + } | 212 | + ret = true; |
206 | + | 213 | + |
207 | +ret_signal: | 214 | +thread_exit: |
208 | + /* | 215 | + /* |
209 | + * Notify possibly waiting vfio_load_cleanup_load_bufs_thread() that | 216 | + * Notify possibly waiting vfio_load_cleanup_load_bufs_thread() that |
210 | + * this thread is exiting. | 217 | + * this thread is exiting. |
211 | + */ | 218 | + */ |
212 | + multifd->load_bufs_thread_running = false; | 219 | + multifd->load_bufs_thread_running = false; |
213 | + qemu_cond_signal(&multifd->load_bufs_thread_finished_cond); | 220 | + qemu_cond_signal(&multifd->load_bufs_thread_finished_cond); |
214 | + | 221 | + |
222 | + trace_vfio_load_bufs_thread_end(vbasedev->name); | ||
223 | + | ||
215 | + return ret; | 224 | + return ret; |
216 | +} | 225 | +} |
217 | + | 226 | + |
218 | VFIOMultifd *vfio_multifd_new(void) | 227 | static VFIOMultifd *vfio_multifd_new(void) |
219 | { | 228 | { |
220 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); | 229 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); |
221 | @@ -XXX,XX +XXX,XX @@ VFIOMultifd *vfio_multifd_new(void) | 230 | @@ -XXX,XX +XXX,XX @@ static VFIOMultifd *vfio_multifd_new(void) |
222 | multifd->load_buf_idx_last = UINT32_MAX; | 231 | multifd->load_buf_idx_last = UINT32_MAX; |
223 | qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); | 232 | qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); |
224 | 233 | ||
225 | + multifd->load_bufs_thread_running = false; | 234 | + multifd->load_bufs_thread_running = false; |
226 | + multifd->load_bufs_thread_want_exit = false; | 235 | + multifd->load_bufs_thread_want_exit = false; |
... | ... | ||
250 | + } | 259 | + } |
251 | + } | 260 | + } |
252 | + bql_lock(); | 261 | + bql_lock(); |
253 | +} | 262 | +} |
254 | + | 263 | + |
255 | void vfio_multifd_free(VFIOMultifd *multifd) | 264 | static void vfio_multifd_free(VFIOMultifd *multifd) |
256 | { | 265 | { |
257 | + vfio_load_cleanup_load_bufs_thread(multifd); | 266 | + vfio_load_cleanup_load_bufs_thread(multifd); |
258 | + | 267 | + |
259 | + qemu_cond_destroy(&multifd->load_bufs_thread_finished_cond); | 268 | + qemu_cond_destroy(&multifd->load_bufs_thread_finished_cond); |
260 | + vfio_state_buffers_destroy(&multifd->load_bufs); | 269 | vfio_state_buffers_destroy(&multifd->load_bufs); |
261 | qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); | 270 | qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); |
262 | qemu_mutex_destroy(&multifd->load_bufs_mutex); | 271 | qemu_mutex_destroy(&multifd->load_bufs_mutex); |
263 | 272 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) | |
264 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp) | ||
265 | 273 | ||
266 | return true; | 274 | return true; |
267 | } | 275 | } |
268 | + | 276 | + |
269 | +int vfio_multifd_switchover_start(VFIODevice *vbasedev) | 277 | +int vfio_multifd_switchover_start(VFIODevice *vbasedev) |
... | ... | ||
287 | +} | 295 | +} |
288 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | 296 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h |
289 | index XXXXXXX..XXXXXXX 100644 | 297 | index XXXXXXX..XXXXXXX 100644 |
290 | --- a/hw/vfio/migration-multifd.h | 298 | --- a/hw/vfio/migration-multifd.h |
291 | +++ b/hw/vfio/migration-multifd.h | 299 | +++ b/hw/vfio/migration-multifd.h |
292 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp); | 300 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); |
293 | bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, | 301 | bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, |
294 | Error **errp); | 302 | Error **errp); |
295 | 303 | ||
296 | +int vfio_multifd_switchover_start(VFIODevice *vbasedev); | 304 | +int vfio_multifd_switchover_start(VFIODevice *vbasedev); |
297 | + | 305 | + |
298 | #endif | 306 | #endif |
299 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | 307 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c |
... | ... | ||
318 | static const SaveVMHandlers savevm_vfio_handlers = { | 326 | static const SaveVMHandlers savevm_vfio_handlers = { |
319 | .save_prepare = vfio_save_prepare, | 327 | .save_prepare = vfio_save_prepare, |
320 | .save_setup = vfio_save_setup, | 328 | .save_setup = vfio_save_setup, |
321 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { | 329 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { |
322 | .load_state = vfio_load_state, | 330 | .load_state = vfio_load_state, |
323 | .load_state_buffer = vfio_load_state_buffer, | 331 | .load_state_buffer = vfio_multifd_load_state_buffer, |
324 | .switchover_ack_needed = vfio_switchover_ack_needed, | 332 | .switchover_ack_needed = vfio_switchover_ack_needed, |
325 | + .switchover_start = vfio_switchover_start, | 333 | + .switchover_start = vfio_switchover_start, |
326 | }; | 334 | }; |
327 | 335 | ||
328 | /* ---------------------------------------------------------------------- */ | 336 | /* ---------------------------------------------------------------------- */ |
329 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | 337 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events |
330 | index XXXXXXX..XXXXXXX 100644 | 338 | index XXXXXXX..XXXXXXX 100644 |
331 | --- a/hw/vfio/trace-events | 339 | --- a/hw/vfio/trace-events |
332 | +++ b/hw/vfio/trace-events | 340 | +++ b/hw/vfio/trace-events |
333 | @@ -XXX,XX +XXX,XX @@ vfio_load_device_config_state_end(const char *name) " (%s)" | 341 | @@ -XXX,XX +XXX,XX @@ vfio_display_edid_update(uint32_t prefx, uint32_t prefy) "%ux%u" |
342 | vfio_display_edid_write_error(void) "" | ||
343 | |||
344 | # migration.c | ||
345 | +vfio_load_bufs_thread_start(const char *name) " (%s)" | ||
346 | +vfio_load_bufs_thread_end(const char *name) " (%s)" | ||
347 | vfio_load_cleanup(const char *name) " (%s)" | ||
348 | vfio_load_device_config_state_start(const char *name) " (%s)" | ||
349 | vfio_load_device_config_state_end(const char *name) " (%s)" | ||
334 | vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 | 350 | vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 |
335 | vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" | 351 | vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" |
336 | vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | 352 | vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32 |
337 | +vfio_load_state_device_buffer_start(const char *name) " (%s)" | 353 | +vfio_load_state_device_buffer_start(const char *name) " (%s)" |
338 | +vfio_load_state_device_buffer_starved(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | 354 | +vfio_load_state_device_buffer_starved(const char *name, uint32_t idx) " (%s) idx %"PRIu32 |
339 | +vfio_load_state_device_buffer_load_start(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | 355 | +vfio_load_state_device_buffer_load_start(const char *name, uint32_t idx) " (%s) idx %"PRIu32 |
340 | +vfio_load_state_device_buffer_load_end(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | 356 | +vfio_load_state_device_buffer_load_end(const char *name, uint32_t idx) " (%s) idx %"PRIu32 |
341 | +vfio_load_state_device_buffer_end(const char *name) " (%s)" | 357 | +vfio_load_state_device_buffer_end(const char *name) " (%s)" |
342 | vfio_migration_realize(const char *name) " (%s)" | 358 | vfio_migration_realize(const char *name) " (%s)" |
343 | vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s" | 359 | vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s" |
344 | vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s" | 360 | vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s" |
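The thread lifecycle this patch builds - a worker that sleeps on a "buffer ready" condition when starved, honors a want-exit flag, and signals a "finished" condition so the cleanup path can wait for it under the same mutex - can be reduced to a small pthreads model. Everything below (the loader struct and function names) is a simplified stand-in, not the QEMU thread API:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t ready;    /* producer -> worker: new buffer queued  */
    pthread_cond_t finished; /* worker -> cleanup: thread is exiting   */
    int queued;
    int consumed;
    bool want_exit;
    bool running;
} loader;

static void *load_thread(void *opaque)
{
    loader *l = opaque;

    pthread_mutex_lock(&l->lock);
    while (!l->want_exit) {
        if (l->consumed < l->queued) {
            l->consumed++;           /* "write one buffer to the device" */
            continue;
        }
        /* Starved: wait for the producer or a cancellation request. */
        pthread_cond_wait(&l->ready, &l->lock);
    }
    /* Mirror of the thread_exit: label - announce we are gone. */
    l->running = false;
    pthread_cond_signal(&l->finished);
    pthread_mutex_unlock(&l->lock);
    return NULL;
}

static void loader_cancel_and_join(loader *l, pthread_t tid)
{
    pthread_mutex_lock(&l->lock);
    l->want_exit = true;
    pthread_cond_signal(&l->ready);  /* wake the worker if it is waiting */
    while (l->running) {
        pthread_cond_wait(&l->finished, &l->lock);
    }
    pthread_mutex_unlock(&l->lock);
    pthread_join(tid, NULL);
}
```

Signalling the ready condition from the cancel path is what lets vfio_load_cleanup_load_bufs_thread() wake a starved worker instead of deadlocking on join.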
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | Automatic memory management helps avoid memory safety issues. | 3 | Automatic memory management helps avoid memory safety issues. |
4 | 4 | ||
5 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | 5 | Reviewed-by: Fabiano Rosas <farosas@suse.de> |
6 | Reviewed-by: Peter Xu <peterx@redhat.com> | 6 | Reviewed-by: Peter Xu <peterx@redhat.com> |
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
8 | --- | 8 | --- |
9 | migration/qemu-file.h | 2 ++ | 9 | migration/qemu-file.h | 2 ++ |
10 | 1 file changed, 2 insertions(+) | 10 | 1 file changed, 2 insertions(+) |
11 | 11 | ||
12 | diff --git a/migration/qemu-file.h b/migration/qemu-file.h | 12 | diff --git a/migration/qemu-file.h b/migration/qemu-file.h |
13 | index XXXXXXX..XXXXXXX 100644 | 13 | index XXXXXXX..XXXXXXX 100644 |
14 | --- a/migration/qemu-file.h | 14 | --- a/migration/qemu-file.h |
15 | +++ b/migration/qemu-file.h | 15 | +++ b/migration/qemu-file.h |
16 | @@ -XXX,XX +XXX,XX @@ QEMUFile *qemu_file_new_input(QIOChannel *ioc); | 16 | @@ -XXX,XX +XXX,XX @@ QEMUFile *qemu_file_new_input(QIOChannel *ioc); |
17 | QEMUFile *qemu_file_new_output(QIOChannel *ioc); | 17 | QEMUFile *qemu_file_new_output(QIOChannel *ioc); |
18 | int qemu_fclose(QEMUFile *f); | 18 | int qemu_fclose(QEMUFile *f); |
19 | 19 | ||
20 | +G_DEFINE_AUTOPTR_CLEANUP_FUNC(QEMUFile, qemu_fclose) | 20 | +G_DEFINE_AUTOPTR_CLEANUP_FUNC(QEMUFile, qemu_fclose) |
21 | + | 21 | + |
22 | /* | 22 | /* |
23 | * qemu_file_transferred: | 23 | * qemu_file_transferred: |
24 | * | 24 | * |
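G_DEFINE_AUTOPTR_CLEANUP_FUNC ultimately rests on the GCC/Clang cleanup attribute, which runs a function on every scope exit, including early returns. A glib-free sketch of the same pattern, where File and file_close() are hypothetical stand-ins for QEMUFile/qemu_fclose():

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int dummy; } File;

/* Counter so a test can observe that cleanup ran exactly once per scope. */
static int files_closed;

/* Cleanup callbacks receive a pointer to the variable, hence File **. */
static void file_close(File **fp)
{
    if (*fp) {
        free(*fp);
        *fp = NULL;
        files_closed++;
    }
}

/* Scope-exit cleanup; g_autoptr(File) would expand to much the same. */
#define autoclose __attribute__((cleanup(file_close)))

static int use_file(int fail_early)
{
    autoclose File *f = calloc(1, sizeof(*f)); /* NULL check elided */

    if (fail_early) {
        return -1; /* early return: f is still closed automatically */
    }
    return 0;
}
```

This is exactly the safety win the commit message refers to: later patches can add early-error returns around QEMUFile users without leaking the file.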
... | ... | ||
---|---|---|---|
5 | 5 | ||
6 | Also, make sure to process the relevant main migration channel flags. | 6 | Also, make sure to process the relevant main migration channel flags. |
7 | 7 | ||
8 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 8 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
9 | --- | 9 | --- |
10 | hw/vfio/migration-multifd.c | 47 ++++++++++++++++++++++++++++++++++- | 10 | hw/vfio/migration-multifd.c | 49 +++++++++++++++++++++++++++++++++-- |
11 | hw/vfio/migration.c | 8 +++++- | 11 | hw/vfio/migration.c | 9 ++++++- |
12 | include/hw/vfio/vfio-common.h | 2 ++ | 12 | include/hw/vfio/vfio-common.h | 2 ++ |
13 | 3 files changed, 55 insertions(+), 2 deletions(-) | 13 | 3 files changed, 57 insertions(+), 3 deletions(-) |
14 | 14 | ||
15 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 15 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
16 | index XXXXXXX..XXXXXXX 100644 | 16 | index XXXXXXX..XXXXXXX 100644 |
17 | --- a/hw/vfio/migration-multifd.c | 17 | --- a/hw/vfio/migration-multifd.c |
18 | +++ b/hw/vfio/migration-multifd.c | 18 | +++ b/hw/vfio/migration-multifd.c |
... | ... | ||
22 | #include "qemu/thread.h" | 22 | #include "qemu/thread.h" |
23 | +#include "io/channel-buffer.h" | 23 | +#include "io/channel-buffer.h" |
24 | #include "migration/qemu-file.h" | 24 | #include "migration/qemu-file.h" |
25 | #include "migration-multifd.h" | 25 | #include "migration-multifd.h" |
26 | #include "trace.h" | 26 | #include "trace.h" |
27 | @@ -XXX,XX +XXX,XX @@ bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, | 27 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, |
28 | 28 | static bool vfio_load_bufs_thread_load_config(VFIODevice *vbasedev, | |
29 | static int vfio_load_bufs_thread_load_config(VFIODevice *vbasedev) | 29 | Error **errp) |
30 | { | 30 | { |
31 | - return -EINVAL; | 31 | - error_setg(errp, "not yet there"); |
32 | - return false; | ||
32 | + VFIOMigration *migration = vbasedev->migration; | 33 | + VFIOMigration *migration = vbasedev->migration; |
33 | + VFIOMultifd *multifd = migration->multifd; | 34 | + VFIOMultifd *multifd = migration->multifd; |
34 | + VFIOStateBuffer *lb; | 35 | + VFIOStateBuffer *lb; |
35 | + g_autoptr(QIOChannelBuffer) bioc = NULL; | 36 | + g_autoptr(QIOChannelBuffer) bioc = NULL; |
36 | + QEMUFile *f_out = NULL, *f_in = NULL; | 37 | + g_autoptr(QEMUFile) f_out = NULL, f_in = NULL; |
37 | + uint64_t mig_header; | 38 | + uint64_t mig_header; |
38 | + int ret; | 39 | + int ret; |
39 | + | 40 | + |
40 | + assert(multifd->load_buf_idx == multifd->load_buf_idx_last); | 41 | + assert(multifd->load_buf_idx == multifd->load_buf_idx_last); |
41 | + lb = vfio_state_buffers_at(&multifd->load_bufs, multifd->load_buf_idx); | 42 | + lb = vfio_state_buffers_at(&multifd->load_bufs, multifd->load_buf_idx); |
... | ... | ||
47 | + f_out = qemu_file_new_output(QIO_CHANNEL(bioc)); | 48 | + f_out = qemu_file_new_output(QIO_CHANNEL(bioc)); |
48 | + qemu_put_buffer(f_out, (uint8_t *)lb->data, lb->len); | 49 | + qemu_put_buffer(f_out, (uint8_t *)lb->data, lb->len); |
49 | + | 50 | + |
50 | + ret = qemu_fflush(f_out); | 51 | + ret = qemu_fflush(f_out); |
51 | + if (ret) { | 52 | + if (ret) { |
52 | + g_clear_pointer(&f_out, qemu_fclose); | 53 | + error_setg(errp, "%s: load config state flush failed: %d", |
53 | + return ret; | 54 | + vbasedev->name, ret); |
55 | + return false; | ||
54 | + } | 56 | + } |
55 | + | 57 | + |
56 | + qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL); | 58 | + qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL); |
57 | + f_in = qemu_file_new_input(QIO_CHANNEL(bioc)); | 59 | + f_in = qemu_file_new_input(QIO_CHANNEL(bioc)); |
58 | + | 60 | + |
59 | + mig_header = qemu_get_be64(f_in); | 61 | + mig_header = qemu_get_be64(f_in); |
60 | + if (mig_header != VFIO_MIG_FLAG_DEV_CONFIG_STATE) { | 62 | + if (mig_header != VFIO_MIG_FLAG_DEV_CONFIG_STATE) { |
61 | + g_clear_pointer(&f_out, qemu_fclose); | 63 | + error_setg(errp, "%s: expected FLAG_DEV_CONFIG_STATE but got %" PRIx64, |
62 | + g_clear_pointer(&f_in, qemu_fclose); | 64 | + vbasedev->name, mig_header); |
63 | + return -EINVAL; | 65 | + return false; |
64 | + } | 66 | + } |
65 | + | 67 | + |
66 | + bql_lock(); | 68 | + bql_lock(); |
67 | + ret = vfio_load_device_config_state(f_in, vbasedev); | 69 | + ret = vfio_load_device_config_state(f_in, vbasedev); |
68 | + bql_unlock(); | 70 | + bql_unlock(); |
69 | + | 71 | + |
70 | + g_clear_pointer(&f_out, qemu_fclose); | ||
71 | + g_clear_pointer(&f_in, qemu_fclose); | ||
72 | + if (ret < 0) { | 72 | + if (ret < 0) { |
73 | + return ret; | 73 | + error_setg(errp, "%s: vfio_load_device_config_state() failed: %d", |
74 | + vbasedev->name, ret); | ||
75 | + return false; | ||
74 | + } | 76 | + } |
75 | + | 77 | + |
76 | + return 0; | 78 | + return true; |
77 | } | 79 | } |
78 | 80 | ||
79 | static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) | 81 | static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) |
80 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | 82 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c |
81 | index XXXXXXX..XXXXXXX 100644 | 83 | index XXXXXXX..XXXXXXX 100644 |
... | ... | ||
93 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) | 95 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) |
94 | switch (data) { | 96 | switch (data) { |
95 | case VFIO_MIG_FLAG_DEV_CONFIG_STATE: | 97 | case VFIO_MIG_FLAG_DEV_CONFIG_STATE: |
96 | { | 98 | { |
97 | + if (vfio_multifd_transfer_enabled(vbasedev)) { | 99 | + if (vfio_multifd_transfer_enabled(vbasedev)) { |
98 | + error_report("%s: got DEV_CONFIG_STATE but doing multifd transfer", | 100 | + error_report("%s: got DEV_CONFIG_STATE in main migration " |
101 | + "channel but doing multifd transfer", | ||
99 | + vbasedev->name); | 102 | + vbasedev->name); |
100 | + return -EINVAL; | 103 | + return -EINVAL; |
101 | + } | 104 | + } |
102 | + | 105 | + |
103 | return vfio_load_device_config_state(f, opaque); | 106 | return vfio_load_device_config_state(f, opaque); |
104 | } | 107 | } |
105 | case VFIO_MIG_FLAG_DEV_SETUP_STATE: | 108 | case VFIO_MIG_FLAG_DEV_SETUP_STATE: |
106 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | 109 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h |
107 | index XXXXXXX..XXXXXXX 100644 | 110 | index XXXXXXX..XXXXXXX 100644 |
108 | --- a/include/hw/vfio/vfio-common.h | 111 | --- a/include/hw/vfio/vfio-common.h |
109 | +++ b/include/hw/vfio/vfio-common.h | 112 | +++ b/include/hw/vfio/vfio-common.h |
110 | @@ -XXX,XX +XXX,XX @@ void vfio_add_bytes_transferred(unsigned long val); | 113 | @@ -XXX,XX +XXX,XX @@ void vfio_mig_add_bytes_transferred(unsigned long val); |
111 | bool vfio_device_state_is_running(VFIODevice *vbasedev); | 114 | bool vfio_device_state_is_running(VFIODevice *vbasedev); |
112 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); | 115 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); |
113 | 116 | ||
114 | +int vfio_load_device_config_state(QEMUFile *f, void *opaque); | 117 | +int vfio_load_device_config_state(QEMUFile *f, void *opaque); |
115 | + | 118 | + |
116 | #ifdef CONFIG_LINUX | 119 | #ifdef CONFIG_LINUX |
117 | int vfio_get_region_info(VFIODevice *vbasedev, int index, | 120 | int vfio_get_region_info(VFIODevice *vbasedev, int index, |
118 | struct vfio_region_info **info); | 121 | struct vfio_region_info **info); | diff view generated by jsdifflib |
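The guard added above makes the main-channel handler reject DEV_CONFIG_STATE when multifd device state transfer is active, since config state is then expected on the multifd channels instead. The control flow reduces to the following standalone sketch; the flag values and function name here are illustrative stand-ins, not the real 64-bit VFIO_MIG_FLAG_* magics from hw/vfio/migration.c:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values (the real ones are 64-bit magic constants). */
enum { MIG_FLAG_DEV_CONFIG_STATE = 1, MIG_FLAG_DEV_SETUP_STATE = 2 };

/* When multifd transfer is enabled, device config state arriving on the
 * main migration channel is a protocol error, so report -EINVAL. */
static int handle_main_channel_flag(uint64_t flag, bool multifd_enabled)
{
    switch (flag) {
    case MIG_FLAG_DEV_CONFIG_STATE:
        if (multifd_enabled) {
            return -EINVAL; /* config state belongs on multifd channels */
        }
        return 0; /* stands in for vfio_load_device_config_state() */
    default:
        return 0; /* other flags handled as before */
    }
}
```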
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | Implement the multifd device state transfer via additional per-device | 3 | Implement the multifd device state transfer via additional per-device |
4 | thread inside save_live_complete_precopy_thread handler. | 4 | thread inside save_live_complete_precopy_thread handler. |
5 | 5 | ||
6 | Switch between doing the data transfer in the new handler and doing it | 6 | Switch between doing the data transfer in the new handler and doing it |
7 | in the old save_state handler depending on the | 7 | in the old save_state handler depending on whether VFIO multifd transfer |
8 | x-migration-multifd-transfer device property value. | 8 | is enabled. |
9 | 9 | ||
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
11 | --- | 11 | --- |
12 | hw/vfio/migration-multifd.c | 139 ++++++++++++++++++++++++++++++++++ | 12 | hw/vfio/migration-multifd.c | 142 ++++++++++++++++++++++++++++++++++ |
13 | hw/vfio/migration-multifd.h | 5 ++ | 13 | hw/vfio/migration-multifd.h | 6 ++ |
14 | hw/vfio/migration.c | 26 +++++-- | 14 | hw/vfio/migration.c | 22 ++++-- |
15 | hw/vfio/trace-events | 2 + | 15 | hw/vfio/trace-events | 2 + |
16 | include/hw/vfio/vfio-common.h | 8 ++ | 16 | include/hw/vfio/vfio-common.h | 6 ++ |
17 | 5 files changed, 174 insertions(+), 6 deletions(-) | 17 | 5 files changed, 172 insertions(+), 6 deletions(-) |
18 | 18 | ||
19 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 19 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
20 | index XXXXXXX..XXXXXXX 100644 | 20 | index XXXXXXX..XXXXXXX 100644 |
21 | --- a/hw/vfio/migration-multifd.c | 21 | --- a/hw/vfio/migration-multifd.c |
22 | +++ b/hw/vfio/migration-multifd.c | 22 | +++ b/hw/vfio/migration-multifd.c |
23 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp) | 23 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) |
24 | return true; | 24 | return true; |
25 | } | 25 | } |
26 | 26 | ||
27 | +void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f) | 27 | +void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f) |
28 | +{ | 28 | +{ |
... | ... | ||
57 | + return false; | 57 | + return false; |
58 | + } | 58 | + } |
59 | + | 59 | + |
60 | + ret = qemu_fflush(f); | 60 | + ret = qemu_fflush(f); |
61 | + if (ret) { | 61 | + if (ret) { |
62 | + error_setg(errp, "save config state flush failed: %d", ret); | 62 | + error_setg(errp, "%s: save config state flush failed: %d", |
63 | + vbasedev->name, ret); | ||
63 | + return false; | 64 | + return false; |
64 | + } | 65 | + } |
65 | + | 66 | + |
66 | + packet_len = sizeof(*packet) + bioc->usage; | 67 | + packet_len = sizeof(*packet) + bioc->usage; |
67 | + packet = g_malloc0(packet_len); | 68 | + packet = g_malloc0(packet_len); |
... | ... | ||
70 | + packet->flags = VFIO_DEVICE_STATE_CONFIG_STATE; | 71 | + packet->flags = VFIO_DEVICE_STATE_CONFIG_STATE; |
71 | + memcpy(&packet->data, bioc->data, bioc->usage); | 72 | + memcpy(&packet->data, bioc->data, bioc->usage); |
72 | + | 73 | + |
73 | + if (!multifd_queue_device_state(idstr, instance_id, | 74 | + if (!multifd_queue_device_state(idstr, instance_id, |
74 | + (char *)packet, packet_len)) { | 75 | + (char *)packet, packet_len)) { |
75 | + error_setg(errp, "multifd config data queuing failed"); | 76 | + error_setg(errp, "%s: multifd config data queuing failed", |
77 | + vbasedev->name); | ||
76 | + return false; | 78 | + return false; |
77 | + } | 79 | + } |
78 | + | 80 | + |
79 | + vfio_add_bytes_transferred(packet_len); | 81 | + vfio_mig_add_bytes_transferred(packet_len); |
80 | + | 82 | + |
81 | + return true; | 83 | + return true; |
82 | +} | 84 | +} |
83 | + | 85 | + |
84 | +/* | 86 | +/* |
... | ... | ||
89 | + * * completing saving the remaining device state and device config, OR: | 91 | + * * completing saving the remaining device state and device config, OR: |
90 | + * * encountering some error while doing the above, OR: | 92 | + * * encountering some error while doing the above, OR: |
91 | + * * being forcefully aborted by the migration core by | 93 | + * * being forcefully aborted by the migration core by |
92 | + * multifd_device_state_save_thread_should_exit() returning true. | 94 | + * multifd_device_state_save_thread_should_exit() returning true. |
93 | + */ | 95 | + */ |
94 | +bool vfio_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d, | 96 | +bool |
95 | + Error **errp) | 97 | +vfio_multifd_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d, |
98 | + Error **errp) | ||
96 | +{ | 99 | +{ |
97 | + VFIODevice *vbasedev = d->handler_opaque; | 100 | + VFIODevice *vbasedev = d->handler_opaque; |
98 | + VFIOMigration *migration = vbasedev->migration; | 101 | + VFIOMigration *migration = vbasedev->migration; |
99 | + bool ret; | 102 | + bool ret = false; |
100 | + g_autofree VFIODeviceStatePacket *packet = NULL; | 103 | + g_autofree VFIODeviceStatePacket *packet = NULL; |
101 | + uint32_t idx; | 104 | + uint32_t idx; |
102 | + | 105 | + |
103 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { | 106 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { |
104 | + /* Nothing to do, vfio_save_complete_precopy() does the transfer. */ | 107 | + /* Nothing to do, vfio_save_complete_precopy() does the transfer. */ |
... | ... | ||
109 | + d->idstr, d->instance_id); | 112 | + d->idstr, d->instance_id); |
110 | + | 113 | + |
111 | + /* We reach here with device state STOP or STOP_COPY only */ | 114 | + /* We reach here with device state STOP or STOP_COPY only */ |
112 | + if (vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY, | 115 | + if (vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY, |
113 | + VFIO_DEVICE_STATE_STOP, errp)) { | 116 | + VFIO_DEVICE_STATE_STOP, errp)) { |
114 | + ret = false; | 117 | + goto thread_exit; |
115 | + goto ret_finish; | ||
116 | + } | 118 | + } |
117 | + | 119 | + |
118 | + packet = g_malloc0(sizeof(*packet) + migration->data_buffer_size); | 120 | + packet = g_malloc0(sizeof(*packet) + migration->data_buffer_size); |
119 | + packet->version = VFIO_DEVICE_STATE_PACKET_VER_CURRENT; | 121 | + packet->version = VFIO_DEVICE_STATE_PACKET_VER_CURRENT; |
120 | + | 122 | + |
121 | + for (idx = 0; ; idx++) { | 123 | + for (idx = 0; ; idx++) { |
122 | + ssize_t data_size; | 124 | + ssize_t data_size; |
123 | + size_t packet_size; | 125 | + size_t packet_size; |
124 | + | 126 | + |
125 | + if (multifd_device_state_save_thread_should_exit()) { | 127 | + if (multifd_device_state_save_thread_should_exit()) { |
126 | + error_setg(errp, "operation cancelled"); | 128 | + error_setg(errp, "operation cancelled"); |
127 | + ret = false; | 129 | + goto thread_exit; |
128 | + goto ret_finish; | ||
129 | + } | 130 | + } |
130 | + | 131 | + |
131 | + data_size = read(migration->data_fd, &packet->data, | 132 | + data_size = read(migration->data_fd, &packet->data, |
132 | + migration->data_buffer_size); | 133 | + migration->data_buffer_size); |
133 | + if (data_size < 0) { | 134 | + if (data_size < 0) { |
134 | + error_setg(errp, "reading state buffer %" PRIu32 " failed: %d", | 135 | + error_setg(errp, "%s: reading state buffer %" PRIu32 " failed: %d", |
135 | + idx, errno); | 136 | + vbasedev->name, idx, errno); |
136 | + ret = false; | 137 | + goto thread_exit; |
137 | + goto ret_finish; | ||
138 | + } else if (data_size == 0) { | 138 | + } else if (data_size == 0) { |
139 | + break; | 139 | + break; |
140 | + } | 140 | + } |
141 | + | 141 | + |
142 | + packet->idx = idx; | 142 | + packet->idx = idx; |
143 | + packet_size = sizeof(*packet) + data_size; | 143 | + packet_size = sizeof(*packet) + data_size; |
144 | + | 144 | + |
145 | + if (!multifd_queue_device_state(d->idstr, d->instance_id, | 145 | + if (!multifd_queue_device_state(d->idstr, d->instance_id, |
146 | + (char *)packet, packet_size)) { | 146 | + (char *)packet, packet_size)) { |
147 | + error_setg(errp, "multifd data queuing failed"); | 147 | + error_setg(errp, "%s: multifd data queuing failed", vbasedev->name); |
148 | + ret = false; | 148 | + goto thread_exit; |
149 | + goto ret_finish; | ||
150 | + } | 149 | + } |
151 | + | 150 | + |
152 | + vfio_add_bytes_transferred(packet_size); | 151 | + vfio_mig_add_bytes_transferred(packet_size); |
153 | + } | 152 | + } |
154 | + | 153 | + |
155 | + ret = vfio_save_complete_precopy_thread_config_state(vbasedev, | 154 | + if (!vfio_save_complete_precopy_thread_config_state(vbasedev, |
156 | + d->idstr, | 155 | + d->idstr, |
157 | + d->instance_id, | 156 | + d->instance_id, |
158 | + idx, errp); | 157 | + idx, errp)) { |
159 | + | 158 | + goto thread_exit; |
160 | +ret_finish: | 159 | + } |
160 | + | ||
161 | + ret = true; | ||
162 | + | ||
163 | +thread_exit: | ||
161 | + trace_vfio_save_complete_precopy_thread_end(vbasedev->name, ret); | 164 | + trace_vfio_save_complete_precopy_thread_end(vbasedev->name, ret); |
162 | + | 165 | + |
163 | + return ret; | 166 | + return ret; |
164 | +} | 167 | +} |
165 | + | 168 | + |
... | ... | ||
168 | VFIOMigration *migration = vbasedev->migration; | 171 | VFIOMigration *migration = vbasedev->migration; |
169 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | 172 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h |
170 | index XXXXXXX..XXXXXXX 100644 | 173 | index XXXXXXX..XXXXXXX 100644 |
171 | --- a/hw/vfio/migration-multifd.h | 174 | --- a/hw/vfio/migration-multifd.h |
172 | +++ b/hw/vfio/migration-multifd.h | 175 | +++ b/hw/vfio/migration-multifd.h |
173 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp); | 176 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); |
174 | bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, | 177 | bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, |
175 | Error **errp); | 178 | Error **errp); |
176 | 179 | ||
177 | +void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f); | 180 | +void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f); |
178 | + | 181 | + |
179 | +bool vfio_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d, | 182 | +bool |
180 | + Error **errp); | 183 | +vfio_multifd_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d, |
184 | + Error **errp); | ||
181 | + | 185 | + |
182 | int vfio_multifd_switchover_start(VFIODevice *vbasedev); | 186 | int vfio_multifd_switchover_start(VFIODevice *vbasedev); |
183 | 187 | ||
184 | #endif | 188 | #endif |
185 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | 189 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c |
... | ... | ||
209 | - Error **errp) | 213 | - Error **errp) |
210 | +int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) | 214 | +int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) |
211 | { | 215 | { |
212 | VFIODevice *vbasedev = opaque; | 216 | VFIODevice *vbasedev = opaque; |
213 | int ret; | 217 | int ret; |
214 | @@ -XXX,XX +XXX,XX @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp) | ||
215 | uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE; | ||
216 | int ret; | ||
217 | |||
218 | + if (!vfio_multifd_transfer_setup(vbasedev, errp)) { | ||
219 | + return -EINVAL; | ||
220 | + } | ||
221 | + | ||
222 | qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE); | ||
223 | |||
224 | vfio_query_stop_copy_size(vbasedev, &stop_copy_size); | ||
225 | @@ -XXX,XX +XXX,XX @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) | 218 | @@ -XXX,XX +XXX,XX @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) |
226 | int ret; | 219 | int ret; |
227 | Error *local_err = NULL; | 220 | Error *local_err = NULL; |
228 | 221 | ||
229 | + if (vfio_multifd_transfer_enabled(vbasedev)) { | 222 | + if (vfio_multifd_transfer_enabled(vbasedev)) { |
... | ... | ||
248 | error_prepend(&local_err, | 241 | error_prepend(&local_err, |
249 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { | 242 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { |
250 | .is_active_iterate = vfio_is_active_iterate, | 243 | .is_active_iterate = vfio_is_active_iterate, |
251 | .save_live_iterate = vfio_save_iterate, | 244 | .save_live_iterate = vfio_save_iterate, |
252 | .save_live_complete_precopy = vfio_save_complete_precopy, | 245 | .save_live_complete_precopy = vfio_save_complete_precopy, |
253 | + .save_live_complete_precopy_thread = vfio_save_complete_precopy_thread, | 246 | + .save_live_complete_precopy_thread = vfio_multifd_save_complete_precopy_thread, |
254 | .save_state = vfio_save_state, | 247 | .save_state = vfio_save_state, |
255 | .load_setup = vfio_load_setup, | 248 | .load_setup = vfio_load_setup, |
256 | .load_cleanup = vfio_load_cleanup, | 249 | .load_cleanup = vfio_load_cleanup, |
257 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | 250 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events |
258 | index XXXXXXX..XXXXXXX 100644 | 251 | index XXXXXXX..XXXXXXX 100644 |
... | ... | ||
269 | vfio_save_iterate_start(const char *name) " (%s)" | 262 | vfio_save_iterate_start(const char *name) " (%s)" |
270 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | 263 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h |
271 | index XXXXXXX..XXXXXXX 100644 | 264 | index XXXXXXX..XXXXXXX 100644 |
272 | --- a/include/hw/vfio/vfio-common.h | 265 | --- a/include/hw/vfio/vfio-common.h |
273 | +++ b/include/hw/vfio/vfio-common.h | 266 | +++ b/include/hw/vfio/vfio-common.h |
274 | @@ -XXX,XX +XXX,XX @@ void vfio_add_bytes_transferred(unsigned long val); | 267 | @@ -XXX,XX +XXX,XX @@ void vfio_mig_add_bytes_transferred(unsigned long val); |
275 | bool vfio_device_state_is_running(VFIODevice *vbasedev); | 268 | bool vfio_device_state_is_running(VFIODevice *vbasedev); |
276 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); | 269 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); |
277 | 270 | ||
278 | +#ifdef CONFIG_LINUX | 271 | +int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp); |
272 | int vfio_load_device_config_state(QEMUFile *f, void *opaque); | ||
273 | |||
274 | #ifdef CONFIG_LINUX | ||
275 | @@ -XXX,XX +XXX,XX @@ struct vfio_info_cap_header * | ||
276 | vfio_get_device_info_cap(struct vfio_device_info *info, uint16_t id); | ||
277 | struct vfio_info_cap_header * | ||
278 | vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id); | ||
279 | + | ||
279 | +int vfio_migration_set_state(VFIODevice *vbasedev, | 280 | +int vfio_migration_set_state(VFIODevice *vbasedev, |
280 | + enum vfio_device_mig_state new_state, | 281 | + enum vfio_device_mig_state new_state, |
281 | + enum vfio_device_mig_state recover_state, | 282 | + enum vfio_device_mig_state recover_state, |
282 | + Error **errp); | 283 | + Error **errp); |
283 | +#endif | 284 | #endif |
284 | + | 285 | |
285 | +int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp); | 286 | bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp); |
286 | int vfio_load_device_config_state(QEMUFile *f, void *opaque); | ||
287 | |||
288 | #ifdef CONFIG_LINUX |
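The save thread in the patch above repeatedly read()s chunks of device state and frames each chunk as a fixed header plus variable payload before queuing it on a multifd channel. The framing step can be sketched as below; DevStatePacket is a simplified, hypothetical stand-in for the patch's VFIODeviceStatePacket, not its actual layout:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for VFIODeviceStatePacket: fixed header followed
 * by a flexible-array payload. */
typedef struct {
    uint32_t version;
    uint32_t idx;    /* chunk sequence number */
    uint32_t flags;
    uint8_t  data[];
} DevStatePacket;

/* Build one packet for chunk idx carrying data_size payload bytes.
 * Total wire size is returned via *packet_size; caller frees the packet. */
static DevStatePacket *build_packet(uint32_t idx, const void *data,
                                    size_t data_size, size_t *packet_size)
{
    DevStatePacket *p = calloc(1, sizeof(*p) + data_size);
    if (!p) {
        return NULL;
    }
    p->version = 1; /* stands in for VFIO_DEVICE_STATE_PACKET_VER_CURRENT */
    p->idx = idx;
    memcpy(p->data, data, data_size);
    *packet_size = sizeof(*p) + data_size;
    return p;
}
```

In the patch the header-plus-payload size is what gets accounted via vfio_mig_add_bytes_transferred() after each successful queue operation.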
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | This property allows configuring at runtime whether to transfer the | 3 | This property allows configuring whether to transfer the particular device |
4 | particular device state via multifd channels when live migrating that | 4 | state via multifd channels when live migrating that device. |
5 | device. | ||
6 | 5 | ||
7 | It defaults to AUTO, which means that VFIO device state transfer via | 6 | It defaults to AUTO, which means that VFIO device state transfer via |
8 | multifd channels is attempted in configurations that otherwise support it. | 7 | multifd channels is attempted in configurations that otherwise support it. |
9 | 8 | ||
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 9 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
11 | --- | 10 | --- |
12 | hw/vfio/migration-multifd.c | 17 ++++++++++++++++- | 11 | hw/vfio/migration-multifd.c | 18 +++++++++++++++++- |
13 | hw/vfio/pci.c | 3 +++ | 12 | hw/vfio/pci.c | 8 ++++++++ |
14 | include/hw/vfio/vfio-common.h | 2 ++ | 13 | include/hw/vfio/vfio-common.h | 2 ++ |
15 | 3 files changed, 21 insertions(+), 1 deletion(-) | 14 | 3 files changed, 27 insertions(+), 1 deletion(-) |
16 | 15 | ||
17 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 16 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
18 | index XXXXXXX..XXXXXXX 100644 | 17 | index XXXXXXX..XXXXXXX 100644 |
19 | --- a/hw/vfio/migration-multifd.c | 18 | --- a/hw/vfio/migration-multifd.c |
20 | +++ b/hw/vfio/migration-multifd.c | 19 | +++ b/hw/vfio/migration-multifd.c |
... | ... | ||
26 | + VFIOMigration *migration = vbasedev->migration; | 25 | + VFIOMigration *migration = vbasedev->migration; |
27 | + | 26 | + |
28 | + return migration->multifd_transfer; | 27 | + return migration->multifd_transfer; |
29 | } | 28 | } |
30 | 29 | ||
31 | bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp) | 30 | bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) |
32 | { | 31 | { |
33 | + VFIOMigration *migration = vbasedev->migration; | 32 | VFIOMigration *migration = vbasedev->migration; |
34 | + | 33 | |
35 | + /* | ||
36 | + * Make a copy of this setting at the start in case it is changed | ||
37 | + * mid-migration. | ||
38 | + */ | ||
39 | + if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { | 34 | + if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { |
40 | + migration->multifd_transfer = vfio_multifd_transfer_supported(); | 35 | + migration->multifd_transfer = vfio_multifd_transfer_supported(); |
41 | + } else { | 36 | + } else { |
42 | + migration->multifd_transfer = | 37 | + migration->multifd_transfer = |
43 | + vbasedev->migration_multifd_transfer == ON_OFF_AUTO_ON; | 38 | + vbasedev->migration_multifd_transfer == ON_OFF_AUTO_ON; |
44 | + } | 39 | + } |
45 | + | 40 | + |
46 | if (vfio_multifd_transfer_enabled(vbasedev) && | 41 | if (!vfio_multifd_transfer_enabled(vbasedev)) { |
47 | !vfio_multifd_transfer_supported()) { | 42 | /* Nothing further to check or do */ |
48 | error_setg(errp, | 43 | return true; |
44 | } | ||
45 | |||
46 | + if (!vfio_multifd_transfer_supported()) { | ||
47 | + error_setg(errp, | ||
48 | + "%s: Multifd device transfer requested but unsupported in the current config", | ||
49 | + vbasedev->name); | ||
50 | + return false; | ||
51 | + } | ||
52 | + | ||
53 | if (alloc_multifd) { | ||
54 | assert(!migration->multifd); | ||
55 | migration->multifd = vfio_multifd_new(); | ||
49 | diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c | 56 | diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c |
50 | index XXXXXXX..XXXXXXX 100644 | 57 | index XXXXXXX..XXXXXXX 100644 |
51 | --- a/hw/vfio/pci.c | 58 | --- a/hw/vfio/pci.c |
52 | +++ b/hw/vfio/pci.c | 59 | +++ b/hw/vfio/pci.c |
53 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { | 60 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { |
... | ... | ||
58 | + vbasedev.migration_multifd_transfer, | 65 | + vbasedev.migration_multifd_transfer, |
59 | + ON_OFF_AUTO_AUTO), | 66 | + ON_OFF_AUTO_AUTO), |
60 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, | 67 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, |
61 | vbasedev.migration_events, false), | 68 | vbasedev.migration_events, false), |
62 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), | 69 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), |
70 | @@ -XXX,XX +XXX,XX @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) | ||
71 | pdc->exit = vfio_exitfn; | ||
72 | pdc->config_read = vfio_pci_read_config; | ||
73 | pdc->config_write = vfio_pci_write_config; | ||
74 | + | ||
75 | + object_class_property_set_description(klass, /* 10.0 */ | ||
76 | + "x-migration-multifd-transfer", | ||
77 | + "Transfer this device state via " | ||
78 | + "multifd channels when live migrating it"); | ||
79 | } | ||
80 | |||
81 | static const TypeInfo vfio_pci_dev_info = { | ||
63 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | 82 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h |
64 | index XXXXXXX..XXXXXXX 100644 | 83 | index XXXXXXX..XXXXXXX 100644 |
65 | --- a/include/hw/vfio/vfio-common.h | 84 | --- a/include/hw/vfio/vfio-common.h |
66 | +++ b/include/hw/vfio/vfio-common.h | 85 | +++ b/include/hw/vfio/vfio-common.h |
67 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOMigration { | 86 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOMigration { |
... | ... |
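The AUTO-default resolution this patch adds to vfio_multifd_setup() boils down to a small decision table: AUTO follows runtime support, while ON/OFF force the choice (with ON later rejected if unsupported). A standalone sketch, where the enum mirrors QEMU's OnOffAuto but the function is illustrative rather than the QEMU code:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative copy of QEMU's OnOffAuto tri-state. */
typedef enum { ON_OFF_AUTO_AUTO, ON_OFF_AUTO_ON, ON_OFF_AUTO_OFF } OnOffAuto;

/* Resolve x-migration-multifd-transfer the way the patch does:
 * AUTO -> whatever the current configuration supports, ON/OFF -> forced. */
static bool resolve_multifd_transfer(OnOffAuto prop, bool supported)
{
    if (prop == ON_OFF_AUTO_AUTO) {
        return supported;
    }
    return prop == ON_OFF_AUTO_ON;
}
```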
... | ... | ||
---|---|---|---|
12 | This brings this property to the same mutability level as ordinary | 12 | This brings this property to the same mutability level as ordinary |
13 | migration parameters, which too can be adjusted at the run time. | 13 | migration parameters, which too can be adjusted at the run time. |
14 | 14 | ||
15 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 15 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
16 | --- | 16 | --- |
17 | hw/vfio/pci.c | 12 +++++++++--- | 17 | hw/vfio/migration-multifd.c | 4 ++++ |
18 | 1 file changed, 9 insertions(+), 3 deletions(-) | 18 | hw/vfio/pci.c | 20 +++++++++++++++++--- |
19 | 2 files changed, 21 insertions(+), 3 deletions(-) | ||
19 | 20 | ||
21 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
22 | index XXXXXXX..XXXXXXX 100644 | ||
23 | --- a/hw/vfio/migration-multifd.c | ||
24 | +++ b/hw/vfio/migration-multifd.c | ||
25 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) | ||
26 | { | ||
27 | VFIOMigration *migration = vbasedev->migration; | ||
28 | |||
29 | + /* | ||
30 | + * Make a copy of this setting at the start in case it is changed | ||
31 | + * mid-migration. | ||
32 | + */ | ||
33 | if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { | ||
34 | migration->multifd_transfer = vfio_multifd_transfer_supported(); | ||
35 | } else { | ||
20 | diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c | 36 | diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c |
21 | index XXXXXXX..XXXXXXX 100644 | 37 | index XXXXXXX..XXXXXXX 100644 |
22 | --- a/hw/vfio/pci.c | 38 | --- a/hw/vfio/pci.c |
23 | +++ b/hw/vfio/pci.c | 39 | +++ b/hw/vfio/pci.c |
24 | @@ -XXX,XX +XXX,XX @@ static void vfio_instance_init(Object *obj) | 40 | @@ -XXX,XX +XXX,XX @@ static void vfio_instance_init(Object *obj) |
25 | pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS; | 41 | pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS; |
26 | } | 42 | } |
27 | 43 | ||
28 | +static PropertyInfo qdev_prop_on_off_auto_mutable; | 44 | +static PropertyInfo vfio_pci_migration_multifd_transfer_prop; |
29 | + | 45 | + |
30 | static const Property vfio_pci_dev_properties[] = { | 46 | static const Property vfio_pci_dev_properties[] = { |
31 | DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), | 47 | DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), |
32 | DEFINE_PROP_UUID_NODEFAULT("vf-token", VFIOPCIDevice, vf_token), | 48 | DEFINE_PROP_UUID_NODEFAULT("vf-token", VFIOPCIDevice, vf_token), |
33 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { | 49 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { |
... | ... | ||
37 | - DEFINE_PROP_ON_OFF_AUTO("x-migration-multifd-transfer", VFIOPCIDevice, | 53 | - DEFINE_PROP_ON_OFF_AUTO("x-migration-multifd-transfer", VFIOPCIDevice, |
38 | - vbasedev.migration_multifd_transfer, | 54 | - vbasedev.migration_multifd_transfer, |
39 | - ON_OFF_AUTO_AUTO), | 55 | - ON_OFF_AUTO_AUTO), |
40 | + DEFINE_PROP("x-migration-multifd-transfer", VFIOPCIDevice, | 56 | + DEFINE_PROP("x-migration-multifd-transfer", VFIOPCIDevice, |
41 | + vbasedev.migration_multifd_transfer, | 57 | + vbasedev.migration_multifd_transfer, |
42 | + qdev_prop_on_off_auto_mutable, OnOffAuto, | 58 | + vfio_pci_migration_multifd_transfer_prop, OnOffAuto, |
43 | + .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), | 59 | + .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), |
44 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, | 60 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, |
45 | vbasedev.migration_events, false), | 61 | vbasedev.migration_events, false), |
46 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), | 62 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), |
47 | @@ -XXX,XX +XXX,XX @@ static const TypeInfo vfio_pci_nohotplug_dev_info = { | 63 | @@ -XXX,XX +XXX,XX @@ static const TypeInfo vfio_pci_nohotplug_dev_info = { |
48 | 64 | ||
49 | static void register_vfio_pci_dev_type(void) | 65 | static void register_vfio_pci_dev_type(void) |
50 | { | 66 | { |
51 | + qdev_prop_on_off_auto_mutable = qdev_prop_on_off_auto; | 67 | + /* |
52 | + qdev_prop_on_off_auto_mutable.realized_set_allowed = true; | 68 | + * Ordinary ON_OFF_AUTO property isn't runtime-mutable, but a source VM can |
69 | + * run for a long time before being migrated, so it is desirable to have a |
70 | + * fallback mechanism to the old way of transferring VFIO device state if |
71 | + * it turns out to be necessary. |
72 | + * The following makes this type of property have the same mutability level | ||
73 | + * as ordinary migration parameters. | ||
74 | + */ | ||
75 | + vfio_pci_migration_multifd_transfer_prop = qdev_prop_on_off_auto; | ||
76 | + vfio_pci_migration_multifd_transfer_prop.realized_set_allowed = true; | ||
53 | + | 77 | + |
54 | type_register_static(&vfio_pci_dev_info); | 78 | type_register_static(&vfio_pci_dev_info); |
55 | type_register_static(&vfio_pci_nohotplug_dev_info); | 79 | type_register_static(&vfio_pci_nohotplug_dev_info); |
56 | } | 80 | } |
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | 1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> |
---|---|---|---|
2 | 2 | ||
3 | Add a hw_compat entry for recently added x-migration-multifd-transfer VFIO | 3 | Add a hw_compat entry for recently added x-migration-multifd-transfer VFIO |
4 | property. | 4 | property. |
5 | 5 | ||
6 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
7 | --- | 8 | --- |
8 | hw/core/machine.c | 1 + | 9 | hw/core/machine.c | 1 + |
9 | 1 file changed, 1 insertion(+) | 10 | 1 file changed, 1 insertion(+) |
10 | 11 | ||
... | ... | ||
17 | { "migration", "multifd-clean-tls-termination", "false" }, | 18 | { "migration", "multifd-clean-tls-termination", "false" }, |
18 | { "migration", "send-switchover-start", "off"}, | 19 | { "migration", "send-switchover-start", "off"}, |
19 | + { "vfio-pci", "x-migration-multifd-transfer", "off" }, | 20 | + { "vfio-pci", "x-migration-multifd-transfer", "off" }, |
20 | }; | 21 | }; |
21 | const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2); | 22 | const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2); |
23 | |||
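The x-migration-max-queued-buffers limit introduced by the following patch amounts to simple in-flight accounting on the load side: each queued-but-not-yet-loaded buffer bumps a counter, queuing fails once the (UINT64_MAX-by-default) limit would be exceeded, and the load thread decrements after consuming a buffer. A sketch with illustrative names, not QEMU's:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t pending;    /* like load_buf_queued_pending_buffers */
    uint64_t max_queued; /* like the migration_max_queued_buffers property */
} LoadBufAccounting;

/* Called when a state buffer arrives from a multifd channel. */
static bool queue_one_buffer(LoadBufAccounting *a)
{
    if (a->pending + 1 > a->max_queued) {
        return false; /* would exceed the configured limit */
    }
    a->pending++;
    return true;
}

/* Called by the load thread after writing one buffer to the device. */
static void complete_one_buffer(LoadBufAccounting *a)
{
    assert(a->pending > 0); /* mirrors the patch's assert before decrement */
    a->pending--;
}
```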
... | ... | ||
---|---|---|---|
9 | migration use cases and the right value depends on the particular setup, | 9 | migration use cases and the right value depends on the particular setup, |
10 | disable the limit by default by setting it to UINT64_MAX. | 10 | disable the limit by default by setting it to UINT64_MAX. |
11 | 11 | ||
12 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 12 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
13 | --- | 13 | --- |
14 | hw/vfio/migration-multifd.c | 14 ++++++++++++++ | 14 | hw/vfio/migration-multifd.c | 16 ++++++++++++++++ |
15 | hw/vfio/pci.c | 2 ++ | 15 | hw/vfio/pci.c | 9 +++++++++ |
16 | include/hw/vfio/vfio-common.h | 1 + | 16 | include/hw/vfio/vfio-common.h | 1 + |
17 | 3 files changed, 17 insertions(+) | 17 | 3 files changed, 26 insertions(+) |
18 | 18 | ||
19 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 19 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
20 | index XXXXXXX..XXXXXXX 100644 | 20 | index XXXXXXX..XXXXXXX 100644 |
21 | --- a/hw/vfio/migration-multifd.c | 21 | --- a/hw/vfio/migration-multifd.c |
22 | +++ b/hw/vfio/migration-multifd.c | 22 | +++ b/hw/vfio/migration-multifd.c |
... | ... | ||
34 | 34 | ||
35 | + multifd->load_buf_queued_pending_buffers++; | 35 | + multifd->load_buf_queued_pending_buffers++; |
36 | + if (multifd->load_buf_queued_pending_buffers > | 36 | + if (multifd->load_buf_queued_pending_buffers > |
37 | + vbasedev->migration_max_queued_buffers) { | 37 | + vbasedev->migration_max_queued_buffers) { |
38 | + error_setg(errp, | 38 | + error_setg(errp, |
39 | + "queuing state buffer %" PRIu32 " would exceed the max of %" PRIu64, | 39 | + "%s: queuing state buffer %" PRIu32 |
40 | + packet->idx, vbasedev->migration_max_queued_buffers); | 40 | + " would exceed the max of %" PRIu64, |
41 | + vbasedev->name, packet->idx, | ||
42 | + vbasedev->migration_max_queued_buffers); | ||
41 | + return false; | 43 | + return false; |
42 | + } | 44 | + } |
43 | + | 45 | + |
44 | lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet)); | 46 | lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet)); |
45 | lb->len = packet_total_size - sizeof(*packet); | 47 | lb->len = packet_total_size - sizeof(*packet); |
46 | lb->is_present = true; | 48 | lb->is_present = true; |
47 | @@ -XXX,XX +XXX,XX @@ static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp) | 49 | @@ -XXX,XX +XXX,XX @@ static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp) |
48 | goto ret_signal; | 50 | goto thread_exit; |
49 | } | 51 | } |
50 | 52 | ||
51 | + assert(multifd->load_buf_queued_pending_buffers > 0); | 53 | + assert(multifd->load_buf_queued_pending_buffers > 0); |
52 | + multifd->load_buf_queued_pending_buffers--; | 54 | + multifd->load_buf_queued_pending_buffers--; |
53 | + | 55 | + |
54 | if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) { | 56 | if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) { |
55 | trace_vfio_load_state_device_buffer_end(vbasedev->name); | 57 | trace_vfio_load_state_device_buffer_end(vbasedev->name); |
56 | } | 58 | } |
57 | @@ -XXX,XX +XXX,XX @@ VFIOMultifd *vfio_multifd_new(void) | 59 | @@ -XXX,XX +XXX,XX @@ static VFIOMultifd *vfio_multifd_new(void) |
58 | 60 | ||
59 | multifd->load_buf_idx = 0; | 61 | multifd->load_buf_idx = 0; |
60 | multifd->load_buf_idx_last = UINT32_MAX; | 62 | multifd->load_buf_idx_last = UINT32_MAX; |
61 | + multifd->load_buf_queued_pending_buffers = 0; | 63 | + multifd->load_buf_queued_pending_buffers = 0; |
62 | qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); | 64 | qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); |
... | ... | ||
66 | index XXXXXXX..XXXXXXX 100644 | 68 | index XXXXXXX..XXXXXXX 100644 |
67 | --- a/hw/vfio/pci.c | 69 | --- a/hw/vfio/pci.c |
68 | +++ b/hw/vfio/pci.c | 70 | +++ b/hw/vfio/pci.c |
69 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { | 71 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { |
70 | vbasedev.migration_multifd_transfer, | 72 | vbasedev.migration_multifd_transfer, |
71 | qdev_prop_on_off_auto_mutable, OnOffAuto, | 73 | vfio_pci_migration_multifd_transfer_prop, OnOffAuto, |
72 | .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), | 74 | .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), |
73 | + DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice, | 75 | + DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice, |
74 | + vbasedev.migration_max_queued_buffers, UINT64_MAX), | 76 | + vbasedev.migration_max_queued_buffers, UINT64_MAX), |
75 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, | 77 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, |
76 | vbasedev.migration_events, false), | 78 | vbasedev.migration_events, false), |
77 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), | 79 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), |
80 | @@ -XXX,XX +XXX,XX @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) | ||
81 | "x-migration-multifd-transfer", | ||
82 | "Transfer this device state via " | ||
83 | "multifd channels when live migrating it"); | ||
84 | + object_class_property_set_description(klass, /* 10.0 */ | ||
85 | + "x-migration-max-queued-buffers", | ||
86 | + "Maximum count of in-flight VFIO " | ||
87 | + "device state buffers queued at the " | ||
88 | + "destination when doing live " | ||
89 | + "migration of device state via " | ||
90 | + "multifd channels"); | ||
91 | } | ||
92 | |||
93 | static const TypeInfo vfio_pci_dev_info = { | ||
78 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | 94 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h |
79 | index XXXXXXX..XXXXXXX 100644 | 95 | index XXXXXXX..XXXXXXX 100644 |
80 | --- a/include/hw/vfio/vfio-common.h | 96 | --- a/include/hw/vfio/vfio-common.h |
81 | +++ b/include/hw/vfio/vfio-common.h | 97 | +++ b/include/hw/vfio/vfio-common.h |
82 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODevice { | 98 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODevice { |
... | ... | ||
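The `x-migration-max-queued-buffers` property added above bounds how many received-but-not-yet-loaded buffers the destination will hold per device. A minimal standalone sketch of that accounting (plain C with illustrative names — not the actual QEMU code, which additionally serializes these counters under `load_bufs_mutex`):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the per-device multifd receive state. */
typedef struct {
    uint64_t queued_pending; /* buffers received but not yet loaded */
    uint64_t max_queued;     /* property value; UINT64_MAX means "unlimited" */
} LoadBufState;

/* Multifd channel side: refuse a new buffer once the cap is reached. */
static bool queue_buffer(LoadBufState *s)
{
    if (s->queued_pending >= s->max_queued) {
        return false; /* caller reports an error, as the patch does */
    }
    s->queued_pending++;
    return true;
}

/* Load thread side: draining a buffer makes room for another one. */
static void consume_buffer(LoadBufState *s)
{
    assert(s->queued_pending > 0);
    s->queued_pending--;
}
```

With the `UINT64_MAX` default the cap is never reached in practice, which matches the property's "unlimited" default above.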
... | ... | ||
---|---|---|---|
8 | The property defaults to AUTO, which means ON for ARM, OFF for other | 8 | The property defaults to AUTO, which means ON for ARM, OFF for other |
9 | platforms. | 9 | platforms. |
10 | 10 | ||
11 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 11 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
12 | --- | 12 | --- |
13 | hw/vfio/migration-multifd.c | 92 +++++++++++++++++++++++++++++++++++ | 13 | hw/vfio/migration-multifd.c | 91 +++++++++++++++++++++++++++++++++++ |
14 | hw/vfio/migration-multifd.h | 3 ++ | 14 | hw/vfio/migration-multifd.h | 3 ++ |
15 | hw/vfio/migration.c | 10 +++- | 15 | hw/vfio/migration.c | 10 +++- |
16 | hw/vfio/pci.c | 3 ++ | 16 | hw/vfio/pci.c | 9 ++++ |
17 | include/hw/vfio/vfio-common.h | 2 + | 17 | include/hw/vfio/vfio-common.h | 2 + |
18 | 5 files changed, 109 insertions(+), 1 deletion(-) | 18 | 5 files changed, 114 insertions(+), 1 deletion(-) |
19 | 19 | ||
20 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | 20 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c |
21 | index XXXXXXX..XXXXXXX 100644 | 21 | index XXXXXXX..XXXXXXX 100644 |
22 | --- a/hw/vfio/migration-multifd.c | 22 | --- a/hw/vfio/migration-multifd.c |
23 | +++ b/hw/vfio/migration-multifd.c | 23 | +++ b/hw/vfio/migration-multifd.c |
... | ... | ||
76 | + * Need to re-check cancellation immediately after wait in case | 76 | + * Need to re-check cancellation immediately after wait in case |
77 | + * cond was signalled by vfio_load_cleanup_load_bufs_thread(). | 77 | + * cond was signalled by vfio_load_cleanup_load_bufs_thread(). |
78 | + */ | 78 | + */ |
79 | + if (vfio_load_bufs_thread_want_exit(multifd, should_quit)) { | 79 | + if (vfio_load_bufs_thread_want_exit(multifd, should_quit)) { |
80 | + error_setg(errp, "operation cancelled"); | 80 | + error_setg(errp, "operation cancelled"); |
81 | + ret = false; | 81 | + goto thread_exit; |
82 | + goto ret_signal; | ||
83 | + } | 82 | + } |
84 | + } | 83 | + } |
85 | + } | 84 | + } |
86 | + | 85 | + |
87 | config_ret = vfio_load_bufs_thread_load_config(vbasedev); | 86 | if (!vfio_load_bufs_thread_load_config(vbasedev, errp)) { |
88 | if (config_ret) { | 87 | goto thread_exit; |
89 | error_setg(errp, "load config state failed: %d", config_ret); | 88 | } |
90 | @@ -XXX,XX +XXX,XX @@ ret_signal: | 89 | @@ -XXX,XX +XXX,XX @@ thread_exit: |
91 | return ret; | 90 | return ret; |
92 | } | 91 | } |
93 | 92 | ||
94 | +int vfio_load_state_config_load_ready(VFIODevice *vbasedev) | 93 | +int vfio_load_state_config_load_ready(VFIODevice *vbasedev) |
95 | +{ | 94 | +{ |
... | ... | ||
131 | + } | 130 | + } |
132 | + | 131 | + |
133 | + return ret; | 132 | + return ret; |
134 | +} | 133 | +} |
135 | + | 134 | + |
136 | VFIOMultifd *vfio_multifd_new(void) | 135 | static VFIOMultifd *vfio_multifd_new(void) |
137 | { | 136 | { |
138 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); | 137 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); |
139 | @@ -XXX,XX +XXX,XX @@ VFIOMultifd *vfio_multifd_new(void) | 138 | @@ -XXX,XX +XXX,XX @@ static VFIOMultifd *vfio_multifd_new(void) |
140 | multifd->load_buf_queued_pending_buffers = 0; | 139 | multifd->load_buf_queued_pending_buffers = 0; |
141 | qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); | 140 | qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); |
142 | 141 | ||
143 | + multifd->load_bufs_iter_done = false; | 142 | + multifd->load_bufs_iter_done = false; |
144 | + qemu_cond_init(&multifd->load_bufs_iter_done_cond); | 143 | + qemu_cond_init(&multifd->load_bufs_iter_done_cond); |
... | ... | ||
152 | qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); | 151 | qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); |
153 | + qemu_cond_signal(&multifd->load_bufs_iter_done_cond); | 152 | + qemu_cond_signal(&multifd->load_bufs_iter_done_cond); |
154 | qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, | 153 | qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, |
155 | &multifd->load_bufs_mutex); | 154 | &multifd->load_bufs_mutex); |
156 | } | 155 | } |
157 | @@ -XXX,XX +XXX,XX @@ void vfio_multifd_free(VFIOMultifd *multifd) | 156 | @@ -XXX,XX +XXX,XX @@ static void vfio_multifd_free(VFIOMultifd *multifd) |
158 | vfio_load_cleanup_load_bufs_thread(multifd); | 157 | vfio_load_cleanup_load_bufs_thread(multifd); |
159 | 158 | ||
160 | qemu_cond_destroy(&multifd->load_bufs_thread_finished_cond); | 159 | qemu_cond_destroy(&multifd->load_bufs_thread_finished_cond); |
161 | + qemu_cond_destroy(&multifd->load_bufs_iter_done_cond); | 160 | + qemu_cond_destroy(&multifd->load_bufs_iter_done_cond); |
162 | vfio_state_buffers_destroy(&multifd->load_bufs); | 161 | vfio_state_buffers_destroy(&multifd->load_bufs); |
163 | qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); | 162 | qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); |
164 | qemu_mutex_destroy(&multifd->load_bufs_mutex); | 163 | qemu_mutex_destroy(&multifd->load_bufs_mutex); |
165 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | 164 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h |
166 | index XXXXXXX..XXXXXXX 100644 | 165 | index XXXXXXX..XXXXXXX 100644 |
167 | --- a/hw/vfio/migration-multifd.h | 166 | --- a/hw/vfio/migration-multifd.h |
168 | +++ b/hw/vfio/migration-multifd.h | 167 | +++ b/hw/vfio/migration-multifd.h |
169 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); | 168 | @@ -XXX,XX +XXX,XX @@ void vfio_multifd_cleanup(VFIODevice *vbasedev); |
170 | 169 | bool vfio_multifd_transfer_supported(void); | |
171 | bool vfio_multifd_transfer_setup(VFIODevice *vbasedev, Error **errp); | 170 | bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); |
172 | 171 | ||
173 | +bool vfio_load_config_after_iter(VFIODevice *vbasedev); | 172 | +bool vfio_load_config_after_iter(VFIODevice *vbasedev); |
174 | bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, | 173 | bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, |
175 | Error **errp); | 174 | Error **errp); |
176 | 175 | ||
177 | +int vfio_load_state_config_load_ready(VFIODevice *vbasedev); | 176 | +int vfio_load_state_config_load_ready(VFIODevice *vbasedev); |
178 | + | 177 | + |
179 | void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f); | 178 | void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f); |
180 | 179 | ||
181 | bool vfio_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d, | 180 | bool |
182 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | 181 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c |
183 | index XXXXXXX..XXXXXXX 100644 | 182 | index XXXXXXX..XXXXXXX 100644 |
184 | --- a/hw/vfio/migration.c | 183 | --- a/hw/vfio/migration.c |
185 | +++ b/hw/vfio/migration.c | 184 | +++ b/hw/vfio/migration.c |
186 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_state(QEMUFile *f, void *opaque) | 185 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_state(QEMUFile *f, void *opaque) |
... | ... | ||
211 | index XXXXXXX..XXXXXXX 100644 | 210 | index XXXXXXX..XXXXXXX 100644 |
212 | --- a/hw/vfio/pci.c | 211 | --- a/hw/vfio/pci.c |
213 | +++ b/hw/vfio/pci.c | 212 | +++ b/hw/vfio/pci.c |
214 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { | 213 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { |
215 | vbasedev.migration_multifd_transfer, | 214 | vbasedev.migration_multifd_transfer, |
216 | qdev_prop_on_off_auto_mutable, OnOffAuto, | 215 | vfio_pci_migration_multifd_transfer_prop, OnOffAuto, |
217 | .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), | 216 | .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), |
218 | + DEFINE_PROP_ON_OFF_AUTO("x-migration-load-config-after-iter", VFIOPCIDevice, | 217 | + DEFINE_PROP_ON_OFF_AUTO("x-migration-load-config-after-iter", VFIOPCIDevice, |
219 | + vbasedev.migration_load_config_after_iter, | 218 | + vbasedev.migration_load_config_after_iter, |
220 | + ON_OFF_AUTO_AUTO), | 219 | + ON_OFF_AUTO_AUTO), |
221 | DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice, | 220 | DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice, |
222 | vbasedev.migration_max_queued_buffers, UINT64_MAX), | 221 | vbasedev.migration_max_queued_buffers, UINT64_MAX), |
223 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, | 222 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, |
223 | @@ -XXX,XX +XXX,XX @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) | ||
224 | "x-migration-multifd-transfer", | ||
225 | "Transfer this device state via " | ||
226 | "multifd channels when live migrating it"); | ||
227 | + object_class_property_set_description(klass, /* 10.0 */ | ||
228 | + "x-migration-load-config-after-iter", | ||
229 | + "Start the config load only after " | ||
230 | + "all iterables were loaded when doing " | ||
231 | + "live migration of device state via " | ||
232 | + "multifd channels"); | ||
233 | object_class_property_set_description(klass, /* 10.0 */ | ||
234 | "x-migration-max-queued-buffers", | ||
235 | "Maximum count of in-flight VFIO " | ||
224 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | 236 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h |
225 | index XXXXXXX..XXXXXXX 100644 | 237 | index XXXXXXX..XXXXXXX 100644 |
226 | --- a/include/hw/vfio/vfio-common.h | 238 | --- a/include/hw/vfio/vfio-common.h |
227 | +++ b/include/hw/vfio/vfio-common.h | 239 | +++ b/include/hw/vfio/vfio-common.h |
228 | @@ -XXX,XX +XXX,XX @@ | 240 | @@ -XXX,XX +XXX,XX @@ |
... | ... | diff view generated by jsdifflib |
... | ... | ||
---|---|---|---|
3 | Update the VFIO documentation at docs/devel/migration describing the | 3 | Update the VFIO documentation at docs/devel/migration describing the |
4 | changes brought by the multifd device state transfer. | 4 | changes brought by the multifd device state transfer. |
5 | 5 | ||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | 6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> |
7 | --- | 7 | --- |
8 | docs/devel/migration/vfio.rst | 80 +++++++++++++++++++++++++++++++---- | 8 | docs/devel/migration/vfio.rst | 79 +++++++++++++++++++++++++++++++---- |
9 | 1 file changed, 71 insertions(+), 9 deletions(-) | 9 | 1 file changed, 72 insertions(+), 7 deletions(-) |
10 | 10 | ||
11 | diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst | 11 | diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst |
12 | index XXXXXXX..XXXXXXX 100644 | 12 | index XXXXXXX..XXXXXXX 100644 |
13 | --- a/docs/devel/migration/vfio.rst | 13 | --- a/docs/devel/migration/vfio.rst |
14 | +++ b/docs/devel/migration/vfio.rst | 14 | +++ b/docs/devel/migration/vfio.rst |
15 | @@ -XXX,XX +XXX,XX @@ helps to reduce the total downtime of the VM. VFIO devices opt-in to pre-copy | 15 | @@ -XXX,XX +XXX,XX @@ VFIO implements the device hooks for the iterative approach as follows: |
16 | support by reporting the VFIO_MIGRATION_PRE_COPY flag in the | 16 | * A ``switchover_ack_needed`` function that checks if the VFIO device uses |
17 | VFIO_DEVICE_FEATURE_MIGRATION ioctl. | 17 | "switchover-ack" migration capability when this capability is enabled. |
18 | 18 | ||
19 | -* A ``save_state`` function to save the device config space if it is present. | ||
20 | +* A ``switchover_start`` function that in the multifd mode starts a thread that | ||
21 | + reassembles the multifd received data and loads it in-order into the device. | ||
22 | + In the non-multifd mode this function is a NOP. | ||
23 | |||
24 | -* A ``save_live_complete_precopy`` function that sets the VFIO device in | ||
25 | - _STOP_COPY state and iteratively copies the data for the VFIO device until | ||
26 | - the vendor driver indicates that no data remains. | ||
27 | +* A ``save_state`` function to save the device config space if it is present | ||
28 | + in the non-multifd mode. | ||
29 | + In the multifd mode it just emits either a dummy EOS marker or | ||
30 | + "all iterables were loaded" flag for configurations that need to defer | ||
31 | + loading device config space after them. | ||
32 | |||
33 | -* A ``load_state`` function that loads the config section and the data | ||
34 | - sections that are generated by the save functions above. | ||
35 | +* A ``save_live_complete_precopy`` function that in the non-multifd mode sets | ||
36 | + the VFIO device in _STOP_COPY state and iteratively copies the data for the | ||
37 | + VFIO device until the vendor driver indicates that no data remains. | ||
38 | + In the multifd mode it just emits a dummy EOS marker. | ||
39 | + | ||
40 | +* A ``save_live_complete_precopy_thread`` function that in the multifd mode | ||
41 | + provides a thread handler performing multifd device state transfer. | ||
42 | + It sets the VFIO device to _STOP_COPY state, iteratively reads the data | ||
43 | + from the VFIO device and queues it for multifd transmission until the vendor | ||
44 | + driver indicates that no data remains. | ||
45 | + After that, it saves the device config space and queues it for multifd | ||
46 | + transfer too. | ||
47 | + In the non-multifd mode this thread is a NOP. | ||
48 | + | ||
49 | +* A ``load_state`` function that loads the data sections that are generated | ||
50 | + by the main migration channel save functions above. | ||
51 | + In the non-multifd mode it also loads the config section, while in the | ||
52 | + multifd mode it handles the optional "all iterables were loaded" flag if | ||
53 | + it is in use. | ||
54 | + | ||
55 | +* A ``load_state_buffer`` function that loads the device state and the device | ||
56 | + config that arrived via multifd channels. | ||
57 | + It's used only in the multifd mode. | ||
58 | |||
59 | * ``cleanup`` functions for both save and load that perform any migration | ||
60 | related cleanup. | ||
61 | @@ -XXX,XX +XXX,XX @@ Live migration save path | ||
62 | Then the VFIO device is put in _STOP_COPY state | ||
63 | (FINISH_MIGRATE, _ACTIVE, _STOP_COPY) | ||
64 | .save_live_complete_precopy() is called for each active device | ||
65 | - For the VFIO device, iterate in .save_live_complete_precopy() until | ||
66 | + For the VFIO device: in the non-multifd mode iterate in | ||
67 | + .save_live_complete_precopy() until | ||
68 | pending data is 0 | ||
69 | + In the multifd mode this iteration is done in | ||
70 | + .save_live_complete_precopy_thread() instead. | ||
71 | | | ||
72 | (POSTMIGRATE, _COMPLETED, _STOP_COPY) | ||
73 | Migration thread schedules cleanup bottom half and exits | ||
74 | @@ -XXX,XX +XXX,XX @@ Live migration resume path | ||
75 | (RESTORE_VM, _ACTIVE, _STOP) | ||
76 | | | ||
77 | For each device, .load_state() is called for that device section data | ||
78 | + transmitted via the main migration channel. | ||
79 | + For data transmitted via multifd channels .load_state_buffer() is called | ||
80 | + instead. | ||
81 | (RESTORE_VM, _ACTIVE, _RESUMING) | ||
82 | | | ||
83 | At the end, .load_cleanup() is called for each device and vCPUs are started | ||
84 | @@ -XXX,XX +XXX,XX @@ Postcopy | ||
85 | ======== | ||
86 | |||
87 | Postcopy migration is currently not supported for VFIO devices. | ||
88 | + | ||
89 | +Multifd | ||
90 | +======= | ||
91 | + | ||
19 | +Starting from QEMU version 10.0 there's a possibility to transfer VFIO device | 92 | +Starting from QEMU version 10.0 there's a possibility to transfer VFIO device |
20 | +_STOP_COPY state via multifd channels. This helps reduce downtime - especially | 93 | +_STOP_COPY state via multifd channels. This helps reduce downtime - especially |
21 | +with multiple VFIO devices or with devices having a large migration state. | 94 | +with multiple VFIO devices or with devices having a large migration state. |
22 | +As an additional benefit, setting the VFIO device to _STOP_COPY state and | 95 | +As an additional benefit, setting the VFIO device to _STOP_COPY state and |
23 | +saving its config space is also parallelized (run in a separate thread) in | 96 | +saving its config space is also parallelized (run in a separate thread) in |
... | ... | ||
44 | +Some host platforms (like ARM64) require that VFIO device config is loaded only | 117 | +Some host platforms (like ARM64) require that VFIO device config is loaded only |
45 | +after all iterables were loaded. | 118 | +after all iterables were loaded. |
46 | +Such interlocking is controlled by "x-migration-load-config-after-iter" VFIO | 119 | +Such interlocking is controlled by "x-migration-load-config-after-iter" VFIO |
47 | +device property, which in its default setting (AUTO) does so only on platforms | 120 | +device property, which in its default setting (AUTO) does so only on platforms |
48 | +that actually require it. | 121 | +that actually require it. |
49 | + | ||
50 | When pre-copy is supported, it's possible to further reduce downtime by | ||
51 | enabling "switchover-ack" migration capability. | ||
52 | VFIO migration uAPI defines "initial bytes" as part of its pre-copy data stream | ||
53 | @@ -XXX,XX +XXX,XX @@ VFIO implements the device hooks for the iterative approach as follows: | ||
54 | * A ``switchover_ack_needed`` function that checks if the VFIO device uses | ||
55 | "switchover-ack" migration capability when this capability is enabled. | ||
56 | |||
57 | -* A ``save_state`` function to save the device config space if it is present. | ||
58 | - | ||
59 | -* A ``save_live_complete_precopy`` function that sets the VFIO device in | ||
60 | - _STOP_COPY state and iteratively copies the data for the VFIO device until | ||
61 | - the vendor driver indicates that no data remains. | ||
62 | - | ||
63 | -* A ``load_state`` function that loads the config section and the data | ||
64 | - sections that are generated by the save functions above. | ||
65 | +* A ``switchover_start`` function that in the multifd mode starts a thread that | ||
66 | + reassembles the multifd received data and loads it in-order into the device. | ||
67 | + In the non-multifd mode this function is a NOP. | ||
68 | + | ||
69 | +* A ``save_state`` function to save the device config space if it is present | ||
70 | + in the non-multifd mode. | ||
71 | + In the multifd mode it just emits either a dummy EOS marker or | ||
72 | + "all iterables were loaded" flag for configurations that need to defer | ||
73 | + loading device config space after them. | ||
74 | + | ||
75 | +* A ``save_live_complete_precopy`` function that in the non-multifd mode sets | ||
76 | + the VFIO device in _STOP_COPY state and iteratively copies the data for the | ||
77 | + VFIO device until the vendor driver indicates that no data remains. | ||
78 | + In the multifd mode it just emits a dummy EOS marker. | ||
79 | + | ||
80 | +* A ``save_live_complete_precopy_thread`` function that in the multifd mode | ||
81 | + provides a thread handler performing multifd device state transfer. | ||
82 | + It sets the VFIO device to _STOP_COPY state, iteratively reads the data | ||
83 | + from the VFIO device and queues it for multifd transmission until the vendor | ||
84 | + driver indicates that no data remains. | ||
85 | + After that, it saves the device config space and queues it for multifd | ||
86 | + transfer too. | ||
87 | + In the non-multifd mode this thread is a NOP. | ||
88 | + | ||
89 | +* A ``load_state`` function that loads the data sections that are generated | ||
90 | + by the main migration channel save functions above. | ||
91 | + In the non-multifd mode it also loads the config section, while in the | ||
92 | + multifd mode it handles the optional "all iterables were loaded" flag if | ||
93 | + it is in use. | ||
94 | + | ||
95 | +* A ``load_state_buffer`` function that loads the device state and the device | ||
96 | + config that arrived via multifd channels. | ||
97 | + It's used only in the multifd mode. | ||
98 | |||
99 | * ``cleanup`` functions for both save and load that perform any migration | ||
100 | related cleanup. | ||
101 | @@ -XXX,XX +XXX,XX @@ Live migration save path | ||
102 | Then the VFIO device is put in _STOP_COPY state | ||
103 | (FINISH_MIGRATE, _ACTIVE, _STOP_COPY) | ||
104 | .save_live_complete_precopy() is called for each active device | ||
105 | - For the VFIO device, iterate in .save_live_complete_precopy() until | ||
106 | + For the VFIO device: in the non-multifd mode iterate in | ||
107 | + .save_live_complete_precopy() until | ||
108 | pending data is 0 | ||
109 | + In the multifd mode this iteration is done in | ||
110 | + .save_live_complete_precopy_thread() instead. | ||
111 | | | ||
112 | (POSTMIGRATE, _COMPLETED, _STOP_COPY) | ||
113 | Migration thread schedules cleanup bottom half and exits | ||
114 | @@ -XXX,XX +XXX,XX @@ Live migration resume path | ||
115 | (RESTORE_VM, _ACTIVE, _STOP) | ||
116 | | | ||
117 | For each device, .load_state() is called for that device section data | ||
118 | + transmitted via the main migration channel. | ||
119 | + For data transmitted via multifd channels .load_state_buffer() is called | ||
120 | + instead. | ||
121 | (RESTORE_VM, _ACTIVE, _RESUMING) | ||
122 | | | ||
123 | At the end, .load_cleanup() is called for each device and vCPUs are started | ||
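The `x-migration-load-config-after-iter` interlock described in the documentation above is, at its core, a condition-variable handshake: the per-device load thread parks before loading config space until the main channel reports that all iterables were loaded. A sketch using plain pthreads (QEMU itself uses `QemuMutex`/`QemuCond`, i.e. the `load_bufs_iter_done_cond` added earlier in this series; all names here are illustrative):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Illustrative stand-in for the load_bufs_iter_done* fields. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t iter_done_cond;
    bool iter_done; /* set once all iterables were loaded */
} LoadInterlock;

/* Main migration channel side: called after the last iterable loads. */
static void mark_iterables_done(LoadInterlock *il)
{
    pthread_mutex_lock(&il->lock);
    il->iter_done = true;
    pthread_cond_signal(&il->iter_done_cond);
    pthread_mutex_unlock(&il->lock);
}

/* Per-device load thread side: park before touching config space. */
static void wait_for_iterables(LoadInterlock *il)
{
    pthread_mutex_lock(&il->lock);
    while (!il->iter_done) { /* loop guards against spurious wakeups */
        pthread_cond_wait(&il->iter_done_cond, &il->lock);
    }
    pthread_mutex_unlock(&il->lock);
}
```

A real implementation also needs a cancellation path — which is why the cleanup code in the patch signals the condition variable, letting the waiter wake up and re-check whether it should exit instead.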
New patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
1 | 2 | ||
3 | All callers to migration_incoming_state_destroy() other than | ||
4 | postcopy_ram_listen_thread() do this call with BQL held. | ||
5 | |||
6 | Since migration_incoming_state_destroy() ultimately calls "load_cleanup" | ||
7 | SaveVMHandlers and it will soon call BQL-sensitive code it makes sense | ||
8 | to always call that function under BQL rather than to have it deal with | ||
9 | both cases (with BQL and without BQL). | ||
10 | Add the necessary bql_lock() and bql_unlock() to | ||
11 | postcopy_ram_listen_thread(). | ||
12 | |||
13 | qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call | ||
14 | "load_state" SaveVMHandlers that are expecting BQL to be held. | ||
15 | |||
16 | In principle, the only devices that should be arriving on migration | ||
17 | channel serviced by postcopy_ram_listen_thread() are those that are | ||
18 | postcopiable and whose load handlers are safe to be called without BQL | ||
19 | being held. | ||
20 | |||
21 | But nothing currently prevents the source from sending data for "unsafe" | ||
22 | devices which would cause trouble there. | ||
23 | Add a TODO comment there so it's clear that it would be good to improve | ||
24 | handling of such (erroneous) case in the future. | ||
25 | |||
26 | Acked-by: Peter Xu <peterx@redhat.com> | ||
27 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
28 | --- | ||
29 | migration/migration.c | 13 +++++++++++++ | ||
30 | migration/savevm.c | 4 ++++ | ||
31 | 2 files changed, 17 insertions(+) | ||
32 | |||
33 | diff --git a/migration/migration.c b/migration/migration.c | ||
34 | index XXXXXXX..XXXXXXX 100644 | ||
35 | --- a/migration/migration.c | ||
36 | +++ b/migration/migration.c | ||
37 | @@ -XXX,XX +XXX,XX @@ void migration_incoming_state_destroy(void) | ||
38 | struct MigrationIncomingState *mis = migration_incoming_get_current(); | ||
39 | |||
40 | multifd_recv_cleanup(); | ||
41 | + | ||
42 | /* | ||
43 | * RAM state cleanup needs to happen after multifd cleanup, because | ||
44 | * multifd threads can use some of its states (receivedmap). | ||
45 | + * The VFIO load_cleanup() implementation is BQL-sensitive. It requires | ||
46 | + * BQL must NOT be taken when recycling load threads, so that it won't | ||
47 | + * block the load threads from making progress on address space | ||
48 | + * modification operations. | ||
49 | + * | ||
50 | + * To make it work, we could try to not take BQL for all load_cleanup(), | ||
51 | + * or conditionally unlock BQL only if bql_locked() in VFIO. | ||
52 | + * | ||
53 | + * Since most existing call sites take BQL for load_cleanup(), make | ||
54 | + * it simple by taking BQL always as the rule, so that VFIO can unlock | ||
55 | + * BQL and retake unconditionally. | ||
56 | */ | ||
57 | + assert(bql_locked()); | ||
58 | qemu_loadvm_state_cleanup(); | ||
59 | |||
60 | if (mis->to_src_file) { | ||
61 | diff --git a/migration/savevm.c b/migration/savevm.c | ||
62 | index XXXXXXX..XXXXXXX 100644 | ||
63 | --- a/migration/savevm.c | ||
64 | +++ b/migration/savevm.c | ||
65 | @@ -XXX,XX +XXX,XX @@ static void *postcopy_ram_listen_thread(void *opaque) | ||
66 | * in qemu_file, and thus we must be blocking now. | ||
67 | */ | ||
68 | qemu_file_set_blocking(f, true); | ||
69 | + | ||
70 | + /* TODO: sanity check that only postcopiable data will be loaded here */ | ||
71 | load_res = qemu_loadvm_state_main(f, mis); | ||
72 | |||
73 | /* | ||
74 | @@ -XXX,XX +XXX,XX @@ static void *postcopy_ram_listen_thread(void *opaque) | ||
75 | * (If something broke then qemu will have to exit anyway since it's | ||
76 | * got a bad migration state). | ||
77 | */ | ||
78 | + bql_lock(); | ||
79 | migration_incoming_state_destroy(); | ||
80 | + bql_unlock(); | ||
81 | |||
82 | rcu_unregister_thread(); | ||
83 | mis->have_listen_thread = false; | ||
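The locking rule this last patch adopts — every caller of migration_incoming_state_destroy() holds BQL, so a callee that must run without it can unconditionally drop and retake it — can be sketched with a toy lock (plain C; this is not QEMU's real bql_lock()/bql_unlock(), just the invariant they enforce):

```c
#include <assert.h>
#include <stdbool.h>

static bool big_lock_held;

static void big_lock(void)   { assert(!big_lock_held); big_lock_held = true; }
static void big_unlock(void) { assert(big_lock_held);  big_lock_held = false; }

/* Stand-in for a load_cleanup() implementation that must not hold the
 * lock while waiting for worker threads (e.g. VFIO load threads). */
static void device_load_cleanup(void)
{
    assert(big_lock_held); /* rule: the caller always holds the lock... */
    big_unlock();          /* ...so dropping it here is always valid */
    /* join worker threads here without blocking them on the lock */
    big_lock();            /* retake before returning to the caller */
}

/* Stand-in for migration_incoming_state_destroy(). */
static void incoming_state_destroy(void)
{
    assert(big_lock_held); /* mirrors the patch's assert(bql_locked()) */
    device_load_cleanup();
}
```

Making "always locked" the rule is simpler than the alternative the commit message mentions, where the callee would have to branch on whether the lock happens to be held.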