The following changes since commit e8a01102936286e012ed0f00bd7f3b7474d415c9:

  Merge tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging (2025-03-05 21:58:23 +0800)

are available in the Git repository at:

  https://github.com/legoater/qemu/ tags/pull-vfio-20250306

for you to fetch changes up to 59a67e70950bcc2002d3a8d22a17743e0f70da96:

  hw/core/machine: Add compat for x-migration-multifd-transfer VFIO property (2025-03-06 06:47:34 +0100)

----------------------------------------------------------------
vfio queue:

* Added property documentation
* Added minor fixes
* Implemented basic PCI PM capability backing
* Promoted new IGD maintainer
* Deprecated vfio-platform
* Extended VFIO migration with multifd support

----------------------------------------------------------------
Alex Williamson (5):
      hw/pci: Basic support for PCI power management
      pci: Use PCI PM capability initializer
      vfio/pci: Delete local pm_cap
      pcie, virtio: Remove redundant pm_cap
      hw/vfio/pci: Re-order pre-reset

Cédric Le Goater (2):
      vfio: Add property documentation
      vfio/ccw: Replace warn_once_pfch() with warn_report_once()

Eric Auger (1):
      vfio-platform: Deprecate all forms of vfio-platform devices

Maciej S. Szmigiero (32):
      migration: Clarify that {load, save}_cleanup handlers can run without setup
      thread-pool: Remove thread_pool_submit() function
      thread-pool: Rename AIO pool functions to *_aio() and data types to *Aio
      thread-pool: Implement generic (non-AIO) pool support
      migration: Add MIG_CMD_SWITCHOVER_START and its load handler
      migration: Add qemu_loadvm_load_state_buffer() and its handler
      migration: Always take BQL for migration_incoming_state_destroy()
      error: define g_autoptr() cleanup function for the Error type
      migration: Add thread pool of optional load threads
      migration/multifd: Split packet into header and RAM data
      migration/multifd: Device state transfer support - receive side
      migration/multifd: Make multifd_send() thread safe
      migration/multifd: Add an explicit MultiFDSendData destructor
      migration/multifd: Device state transfer support - send side
      migration/multifd: Add multifd_device_state_supported()
      migration: Add save_live_complete_precopy_thread handler
      vfio/migration: Add load_device_config_state_start trace event
      vfio/migration: Convert bytes_transferred counter to atomic
      vfio/migration: Add vfio_add_bytes_transferred()
      vfio/migration: Move migration channel flags to vfio-common.h header file
      vfio/migration: Multifd device state transfer support - basic types
      vfio/migration: Multifd device state transfer - add support checking function
      vfio/migration: Multifd setup/cleanup functions and associated VFIOMultifd
      vfio/migration: Setup and cleanup multifd transfer in these general methods
      vfio/migration: Multifd device state transfer support - received buffers queuing
      vfio/migration: Multifd device state transfer support - load thread
      migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile
      vfio/migration: Multifd device state transfer support - config loading support
      vfio/migration: Multifd device state transfer support - send side
      vfio/migration: Add x-migration-multifd-transfer VFIO property
      vfio/migration: Make x-migration-multifd-transfer VFIO property mutable
      hw/core/machine: Add compat for x-migration-multifd-transfer VFIO property

Peter Xu (1):
      migration/multifd: Make MultiFDSendData a struct

Tomita Moeko (1):
      MAINTAINERS: Add myself as vfio-igd maintainer

 MAINTAINERS                        |   9 +-
 docs/about/deprecated.rst          |  25 ++
 docs/devel/migration/vfio.rst      |  45 ++-
 hw/vfio/migration-multifd.h        |  34 ++
 hw/vfio/pci.h                      |   1 -
 include/block/aio.h                |   8 +-
 include/block/thread-pool.h        |  62 +++-
 include/hw/pci/pci.h               |   3 +
 include/hw/pci/pci_device.h        |   3 +
 include/hw/pci/pcie.h              |   2 -
 include/hw/vfio/vfio-common.h      |  31 ++
 include/migration/client-options.h |   4 +
 include/migration/misc.h           |  25 ++
 include/migration/register.h       |  52 ++-
 include/qapi/error.h               |   2 +
 include/qemu/typedefs.h            |   5 +
 migration/migration.h              |   7 +
 migration/multifd.h                |  74 +++-
 migration/qemu-file.h              |   2 +
 migration/savevm.h                 |   6 +-
 hw/core/machine.c                  |   2 +
 hw/net/e1000e.c                    |   3 +-
 hw/net/eepro100.c                  |   4 +-
 hw/net/igb.c                       |   3 +-
 hw/nvme/ctrl.c                     |   3 +-
 hw/pci-bridge/pcie_pci_bridge.c    |   3 +-
 hw/pci/pci.c                       |  93 ++++-
 hw/vfio/amd-xgbe.c                 |   2 +
 hw/vfio/ap.c                       |   9 +
 hw/vfio/calxeda-xgmac.c            |   2 +
 hw/vfio/ccw.c                      |  27 +-
 hw/vfio/migration-multifd.c        | 679 +++++++++++++++++++++++++++++++++++++
 hw/vfio/migration.c                | 106 ++++--
 hw/vfio/pci.c                      | 180 +++++++++-
 hw/vfio/platform.c                 |  25 ++
 hw/virtio/virtio-pci.c             |  11 +-
 migration/colo.c                   |   3 +
 migration/migration-hmp-cmds.c     |   2 +
 migration/migration.c              |  17 +-
 migration/multifd-device-state.c   | 212 ++++++++++++
 migration/multifd-nocomp.c         |  30 +-
 migration/multifd.c                | 248 +++++++++---
 migration/options.c                |   9 +
 migration/savevm.c                 | 201 ++++++++++-
 tests/unit/test-thread-pool.c      |   6 +-
 util/async.c                       |   6 +-
 util/thread-pool.c                 | 184 ++++++++--
 hw/pci/trace-events                |   2 +
 hw/vfio/meson.build                |   1 +
 hw/vfio/trace-events               |  13 +-
 migration/meson.build              |   1 +
 migration/trace-events             |   1 +
 scripts/analyze-migration.py       |  11 +
 util/trace-events                  |   6 +-
 54 files changed, 2296 insertions(+), 209 deletions(-)
 create mode 100644 hw/vfio/migration-multifd.h
 create mode 100644 hw/vfio/migration-multifd.c
 create mode 100644 migration/multifd-device-state.c
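
As a point of reference for the multifd work above (illustration only, not part
of the pull request itself): per-device use of the new transfer mode is
controlled by the x-migration-multifd-transfer vfio-pci property added late in
the series. A hypothetical command line enabling it could look like the
following, where the host address and the chosen value are made-up examples:

  -device vfio-pci,host=0000:3b:00.0,x-migration-multifd-transfer=on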
Investigate the git history to uncover when and why the VFIO
properties were introduced and update the models. This is mostly
targeting the vfio-pci device, since vfio-platform, vfio-ap and vfio-ccw
devices are simpler.

Sort the properties based on the QEMU version in which they were
introduced.

Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Eric Farman <farman@linux.ibm.com>
Cc: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com> # vfio-ccw
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250217173455.449983-1-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/ap.c       |   9 ++++
 hw/vfio/ccw.c      |  15 ++++++
 hw/vfio/pci.c      | 125 +++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/platform.c |  24 +++++++++
 4 files changed, 173 insertions(+)

diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/ap.c
+++ b/hw/vfio/ap.c
@@ -XXX,XX +XXX,XX @@ static void vfio_ap_class_init(ObjectClass *klass, void *data)
     dc->hotpluggable = true;
     device_class_set_legacy_reset(dc, vfio_ap_reset);
     dc->bus_type = TYPE_AP_BUS;
+
+    object_class_property_set_description(klass, /* 3.1 */
+        "sysfsdev",
+        "Host sysfs path of assigned device");
+#ifdef CONFIG_IOMMUFD
+    object_class_property_set_description(klass, /* 9.0 */
+        "iommufd",
+        "Set host IOMMUFD backend device");
+#endif
 }

 static const TypeInfo vfio_ap_info = {
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -XXX,XX +XXX,XX @@ static void vfio_ccw_class_init(ObjectClass *klass, void *data)
     cdc->handle_halt = vfio_ccw_handle_halt;
     cdc->handle_clear = vfio_ccw_handle_clear;
     cdc->handle_store = vfio_ccw_handle_store;
+
+    object_class_property_set_description(klass, /* 2.10 */
+        "sysfsdev",
+        "Host sysfs path of assigned device");
+    object_class_property_set_description(klass, /* 3.0 */
+        "force-orb-pfch",
+        "Force unlimited prefetch");
+#ifdef CONFIG_IOMMUFD
+    object_class_property_set_description(klass, /* 9.0 */
+        "iommufd",
+        "Set host IOMMUFD backend device");
+#endif
+    object_class_property_set_description(klass, /* 9.2 */
+        "loadparm",
+        "Define which devices can be used for booting");
 }

 static const TypeInfo vfio_ccw_info = {
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -XXX,XX +XXX,XX @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
     pdc->exit = vfio_exitfn;
     pdc->config_read = vfio_pci_read_config;
     pdc->config_write = vfio_pci_write_config;
+
+    object_class_property_set_description(klass, /* 1.3 */
+        "host",
+        "Host PCI address [domain:]<bus:slot.function> of assigned device");
+    object_class_property_set_description(klass, /* 1.3 */
+        "x-intx-mmap-timeout-ms",
+        "When EOI is not provided by KVM/QEMU, wait time "
+        "(milliseconds) to re-enable device direct access "
+        "after INTx (DEBUG)");
+    object_class_property_set_description(klass, /* 1.5 */
+        "x-vga",
+        "Expose VGA address spaces for device");
+    object_class_property_set_description(klass, /* 2.3 */
+        "x-req",
+        "Disable device request notification support (DEBUG)");
+    object_class_property_set_description(klass, /* 2.4 and 2.5 */
+        "x-no-mmap",
+        "Disable MMAP for device. Allows to trace MMIO "
+        "accesses (DEBUG)");
+    object_class_property_set_description(klass, /* 2.5 */
+        "x-no-kvm-intx",
+        "Disable direct VFIO->KVM INTx injection. Allows to "
+        "trace INTx interrupts (DEBUG)");
+    object_class_property_set_description(klass, /* 2.5 */
+        "x-no-kvm-msi",
+        "Disable direct VFIO->KVM MSI injection. Allows to "
+        "trace MSI interrupts (DEBUG)");
+    object_class_property_set_description(klass, /* 2.5 */
+        "x-no-kvm-msix",
+        "Disable direct VFIO->KVM MSIx injection. Allows to "
+        "trace MSIx interrupts (DEBUG)");
+    object_class_property_set_description(klass, /* 2.5 */
+        "x-pci-vendor-id",
+        "Override PCI Vendor ID with provided value (DEBUG)");
+    object_class_property_set_description(klass, /* 2.5 */
+        "x-pci-device-id",
+        "Override PCI device ID with provided value (DEBUG)");
+    object_class_property_set_description(klass, /* 2.5 */
+        "x-pci-sub-vendor-id",
+        "Override PCI Subsystem Vendor ID with provided value "
+        "(DEBUG)");
+    object_class_property_set_description(klass, /* 2.5 */
+        "x-pci-sub-device-id",
+        "Override PCI Subsystem Device ID with provided value "
+        "(DEBUG)");
+    object_class_property_set_description(klass, /* 2.6 */
+        "sysfsdev",
+        "Host sysfs path of assigned device");
+    object_class_property_set_description(klass, /* 2.7 */
+        "x-igd-opregion",
+        "Expose host IGD OpRegion to guest");
+    object_class_property_set_description(klass, /* 2.7 (See c4c45e943e51) */
+        "x-igd-gms",
+        "Override IGD data stolen memory size (32MiB units)");
+    object_class_property_set_description(klass, /* 2.11 */
+        "x-nv-gpudirect-clique",
+        "Add NVIDIA GPUDirect capability indicating P2P DMA "
+        "clique for device [0-15]");
+    object_class_property_set_description(klass, /* 2.12 */
+        "x-no-geforce-quirks",
+        "Disable GeForce quirks (for NVIDIA Quadro/GRID/Tesla). "
+        "Improves performance");
+    object_class_property_set_description(klass, /* 2.12 */
+        "display",
+        "Enable display support for device, ex. vGPU");
+    object_class_property_set_description(klass, /* 2.12 */
+        "x-msix-relocation",
+        "Specify MSI-X MMIO relocation to the end of specified "
+        "existing BAR or new BAR to avoid virtualization overhead "
+        "due to adjacent device registers");
+    object_class_property_set_description(klass, /* 3.0 */
+        "x-no-kvm-ioeventfd",
+        "Disable registration of ioeventfds with KVM (DEBUG)");
+    object_class_property_set_description(klass, /* 3.0 */
+        "x-no-vfio-ioeventfd",
+        "Disable linking of KVM ioeventfds to VFIO ioeventfds "
+        "(DEBUG)");
+    object_class_property_set_description(klass, /* 3.1 */
+        "x-balloon-allowed",
+        "Override allowing ballooning with device (DEBUG, DANGER)");
+    object_class_property_set_description(klass, /* 3.2 */
+        "xres",
+        "Set X display resolution the vGPU should use");
+    object_class_property_set_description(klass, /* 3.2 */
+        "yres",
+        "Set Y display resolution the vGPU should use");
+    object_class_property_set_description(klass, /* 5.2 */
+        "x-pre-copy-dirty-page-tracking",
+        "Disable dirty pages tracking during iterative phase "
+        "(DEBUG)");
+    object_class_property_set_description(klass, /* 5.2, 8.0 non-experimental */
+        "enable-migration",
+        "Enable device migration. Also requires a host VFIO PCI "
+        "variant or mdev driver with migration support enabled");
+    object_class_property_set_description(klass, /* 8.1 */
+        "vf-token",
+        "Specify UUID VF token. Required for VF when PF is owned "
+        "by another VFIO driver");
+#ifdef CONFIG_IOMMUFD
+    object_class_property_set_description(klass, /* 9.0 */
+        "iommufd",
+        "Set host IOMMUFD backend device");
+#endif
+    object_class_property_set_description(klass, /* 9.1 */
+        "x-device-dirty-page-tracking",
+        "Disable device dirty page tracking and use "
+        "container-based dirty page tracking (DEBUG)");
+    object_class_property_set_description(klass, /* 9.1 */
+        "migration-events",
+        "Emit VFIO migration QAPI event when a VFIO device "
+        "changes its migration state. For management applications");
+    object_class_property_set_description(klass, /* 9.1 */
+        "skip-vsc-check",
+        "Skip config space check for Vendor Specific Capability. "
+        "Setting to false will enforce strict checking of VSC content "
+        "(DEBUG)");
 }

 static const TypeInfo vfio_pci_dev_info = {
@@ -XXX,XX +XXX,XX @@ static void vfio_pci_nohotplug_dev_class_init(ObjectClass *klass, void *data)

     device_class_set_props(dc, vfio_pci_dev_nohotplug_properties);
     dc->hotpluggable = false;
+
+    object_class_property_set_description(klass, /* 3.1 */
+        "ramfb",
+        "Enable ramfb to provide pre-boot graphics for devices "
+        "enabling display option");
+    object_class_property_set_description(klass, /* 8.2 */
+        "x-ramfb-migrate",
+        "Override default migration support for ramfb support "
+        "(DEBUG)");
 }

 static const TypeInfo vfio_pci_nohotplug_dev_info = {
diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -XXX,XX +XXX,XX @@ static void vfio_platform_class_init(ObjectClass *klass, void *data)
     dc->desc = "VFIO-based platform device assignment";
     sbc->connect_irq_notifier = vfio_start_irqfd_injection;
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+
+    object_class_property_set_description(klass, /* 2.4 */
+        "host",
+        "Host device name of assigned device");
+    object_class_property_set_description(klass, /* 2.4 and 2.5 */
+        "x-no-mmap",
+        "Disable MMAP for device. Allows to trace MMIO "
+        "accesses (DEBUG)");
+    object_class_property_set_description(klass, /* 2.4 */
+        "mmap-timeout-ms",
+        "When EOI is not provided by KVM/QEMU, wait time "
+        "(milliseconds) to re-enable device direct access "
+        "after level interrupt (DEBUG)");
+    object_class_property_set_description(klass, /* 2.4 */
+        "x-irqfd",
+        "Allow disabling irqfd support (DEBUG)");
+    object_class_property_set_description(klass, /* 2.6 */
+        "sysfsdev",
+        "Host sysfs path of assigned device");
+#ifdef CONFIG_IOMMUFD
+    object_class_property_set_description(klass, /* 9.0 */
+        "iommufd",
+        "Set host IOMMUFD backend device");
+#endif
 }

 static const TypeInfo vfio_platform_dev_info = {
--
2.48.1

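As a usage note for the patch above: once the descriptions are registered they
become visible through QEMU's property help. The output below is abridged and
purely illustrative, not taken from a real run:

  $ qemu-system-x86_64 -device vfio-pci,help
  vfio-pci options:
    host=<str>     - Host PCI address [domain:]<bus:slot.function> of assigned device
    sysfsdev=<str> - Host sysfs path of assigned device
    ...
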
Use the common helper warn_report_once() instead of implementing its
own.

Cc: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/qemu-devel/20250214161936.1720039-1-clg@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/ccw.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -XXX,XX +XXX,XX @@ struct VFIOCCWDevice {
     EventNotifier crw_notifier;
     EventNotifier req_notifier;
     bool force_orb_pfch;
-    bool warned_orb_pfch;
 };

-static inline void warn_once_pfch(VFIOCCWDevice *vcdev, SubchDev *sch,
-                                  const char *msg)
-{
-    warn_report_once_cond(&vcdev->warned_orb_pfch,
-                          "vfio-ccw (devno %x.%x.%04x): %s",
-                          sch->cssid, sch->ssid, sch->devno, msg);
-}
-
 static void vfio_ccw_compute_needs_reset(VFIODevice *vdev)
 {
     vdev->needs_reset = false;
@@ -XXX,XX +XXX,XX @@ static IOInstEnding vfio_ccw_handle_request(SubchDev *sch)

     if (!(sch->orb.ctrl0 & ORB_CTRL0_MASK_PFCH) && vcdev->force_orb_pfch) {
         sch->orb.ctrl0 |= ORB_CTRL0_MASK_PFCH;
-        warn_once_pfch(vcdev, sch, "PFCH flag forced");
+        warn_report_once("vfio-ccw (devno %x.%x.%04x): PFCH flag forced",
+                         sch->cssid, sch->ssid, sch->devno);
     }

     QEMU_BUILD_BUG_ON(sizeof(region->orb_area) != sizeof(ORB));
--
2.48.1

From: Alex Williamson <alex.williamson@redhat.com>

The memory and IO BARs for devices are only accessible in the D0 power
state. In other power states the PCI spec defines that the device
responds to TLPs and messages with an Unsupported Request response.

To approximate this behavior, consider the BARs as unmapped when the
device is not in the D0 power state. This makes the BARs inaccessible
and has the additional bonus for vfio-pci that we don't attempt to DMA
map BARs for devices in a non-D0 power state.

To support this, an interface is added for devices to register the PM
capability, which allows central tracking to enforce valid transitions
and unmap BARs in non-D0 states.

NB. We currently have device models (eepro100 and pcie_pci_bridge)
that register a PM capability but do not set wmask to enable writes to
the power state field. In order to maintain migration compatibility,
this new helper does not manage the wmask to enable guest writes to
initiate a power state change. The contents and write access of the
PM capability are still managed by the caller.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-2-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 include/hw/pci/pci.h        |  3 ++
 include/hw/pci/pci_device.h |  3 ++
 hw/pci/pci.c                | 93 ++++++++++++++++++++++++++++++++++++-
 hw/pci/trace-events         |  2 +
 4 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -XXX,XX +XXX,XX @@ enum {
     QEMU_PCIE_ARI_NEXTFN_1 = (1 << QEMU_PCIE_ARI_NEXTFN_1_BITNR),
 #define QEMU_PCIE_EXT_TAG_BITNR 13
     QEMU_PCIE_EXT_TAG = (1 << QEMU_PCIE_EXT_TAG_BITNR),
+#define QEMU_PCI_CAP_PM_BITNR 14
+    QEMU_PCI_CAP_PM = (1 << QEMU_PCI_CAP_PM_BITNR),
 };

 typedef struct PCIINTxRoute {
@@ -XXX,XX +XXX,XX @@ static inline void pci_irq_deassert(PCIDevice *pci_dev)
 MSIMessage pci_get_msi_message(PCIDevice *dev, int vector);
 void pci_set_enabled(PCIDevice *pci_dev, bool state);
 void pci_set_power(PCIDevice *pci_dev, bool state);
+int pci_pm_init(PCIDevice *pci_dev, uint8_t offset, Error **errp);

 #endif
diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/pci/pci_device.h
+++ b/include/hw/pci/pci_device.h
@@ -XXX,XX +XXX,XX @@ struct PCIDevice {
     /* Capability bits */
     uint32_t cap_present;

+    /* Offset of PM capability in config space */
+    uint8_t pm_cap;
+
     /* Offset of MSI-X capability in config space */
     uint8_t msix_cap;

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -XXX,XX +XXX,XX @@ static void pci_msi_trigger(PCIDevice *dev, MSIMessage msg)
                  attrs, NULL);
 }

+/*
+ * Register and track a PM capability. If wmask is also enabled for the power
+ * state field of the pmcsr register, guest writes may change the device PM
+ * state. BAR access is only enabled while the device is in the D0 state.
+ * Return the capability offset or negative error code.
+ */
+int pci_pm_init(PCIDevice *d, uint8_t offset, Error **errp)
+{
+    int cap = pci_add_capability(d, PCI_CAP_ID_PM, offset, PCI_PM_SIZEOF, errp);
+
+    if (cap < 0) {
+        return cap;
+    }
+
+    d->pm_cap = cap;
+    d->cap_present |= QEMU_PCI_CAP_PM;
+
+    return cap;
+}
+
+static uint8_t pci_pm_state(PCIDevice *d)
+{
+    uint16_t pmcsr;
+
+    if (!(d->cap_present & QEMU_PCI_CAP_PM)) {
+        return 0;
+    }
+
+    pmcsr = pci_get_word(d->config + d->pm_cap + PCI_PM_CTRL);
+
+    return pmcsr & PCI_PM_CTRL_STATE_MASK;
+}
+
+/*
+ * Update the PM capability state based on the new value stored in config
+ * space respective to the old, pre-write state provided. If the new value
+ * is rejected (unsupported or invalid transition) restore the old value.
+ * Return the resulting PM state.
+ */
+static uint8_t pci_pm_update(PCIDevice *d, uint32_t addr, int l, uint8_t old)
+{
+    uint16_t pmc;
+    uint8_t new;
+
+    if (!(d->cap_present & QEMU_PCI_CAP_PM) ||
+        !range_covers_byte(addr, l, d->pm_cap + PCI_PM_CTRL)) {
+        return old;
+    }
+
+    new = pci_pm_state(d);
+    if (new == old) {
+        return old;
+    }
+
+    pmc = pci_get_word(d->config + d->pm_cap + PCI_PM_PMC);
+
+    /*
+     * Transitions to D1 & D2 are only allowed if supported. Devices may
+     * only transition to higher D-states or to D0.
+     */
+    if ((!(pmc & PCI_PM_CAP_D1) && new == 1) ||
+        (!(pmc & PCI_PM_CAP_D2) && new == 2) ||
+        (old && new && new < old)) {
+        pci_word_test_and_clear_mask(d->config + d->pm_cap + PCI_PM_CTRL,
+                                     PCI_PM_CTRL_STATE_MASK);
+        pci_word_test_and_set_mask(d->config + d->pm_cap + PCI_PM_CTRL,
+                                   old);
+        trace_pci_pm_bad_transition(d->name, pci_dev_bus_num(d),
+                                    PCI_SLOT(d->devfn), PCI_FUNC(d->devfn),
+                                    old, new);
+        return old;
+    }
+
+    trace_pci_pm_transition(d->name, pci_dev_bus_num(d), PCI_SLOT(d->devfn),
+                            PCI_FUNC(d->devfn), old, new);
+    return new;
+}
+
 static void pci_reset_regions(PCIDevice *dev)
 {
     int r;
@@ -XXX,XX +XXX,XX @@ static void pci_do_device_reset(PCIDevice *dev)
         pci_get_word(dev->wmask + PCI_INTERRUPT_LINE) |
         pci_get_word(dev->w1cmask + PCI_INTERRUPT_LINE));
     dev->config[PCI_CACHE_LINE_SIZE] = 0x0;
+    /* Default PM state is D0 */
+    if (dev->cap_present & QEMU_PCI_CAP_PM) {
+        pci_word_test_and_clear_mask(dev->config + dev->pm_cap + PCI_PM_CTRL,
+                                     PCI_PM_CTRL_STATE_MASK);
+    }
     pci_reset_regions(dev);
     pci_update_mappings(dev);

@@ -XXX,XX +XXX,XX @@ static void pci_update_mappings(PCIDevice *d)
             continue;

         new_addr = pci_bar_address(d, i, r->type, r->size);
-        if (!d->enabled) {
+        if (!d->enabled || pci_pm_state(d)) {
             new_addr = PCI_BAR_UNMAPPED;
         }

@@ -XXX,XX +XXX,XX @@ uint32_t pci_default_read_config(PCIDevice *d,

 void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int l)
 {
+    uint8_t new_pm_state, old_pm_state = pci_pm_state(d);
     int i, was_irq_disabled = pci_irq_disabled(d);
     uint32_t val = val_in;

@@ -XXX,XX +XXX,XX @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in, int
         d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
         d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
     }
+
+    new_pm_state = pci_pm_update(d, addr, l, old_pm_state);
+
     if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
         ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
         ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
-        range_covers_byte(addr, l, PCI_COMMAND))
+        range_covers_byte(addr, l, PCI_COMMAND) ||
+        !!new_pm_state != !!old_pm_state) {
         pci_update_mappings(d);
+    }

     if (ranges_overlap(addr, l, PCI_COMMAND, 2)) {
         pci_update_irq_disabled(d, was_irq_disabled);
diff --git a/hw/pci/trace-events b/hw/pci/trace-events
index XXXXXXX..XXXXXXX 100644
--- a/hw/pci/trace-events
+++ b/hw/pci/trace-events
@@ -XXX,XX +XXX,XX @@
 # See docs/devel/tracing.rst for syntax documentation.

 # pci.c
+pci_pm_bad_transition(const char *dev, uint32_t bus, uint32_t slot, uint32_t func, uint8_t old, uint8_t new) "%s %02x:%02x.%x REJECTED PM transition D%d->D%d"
+pci_pm_transition(const char *dev, uint32_t bus, uint32_t slot, uint32_t func, uint8_t old, uint8_t new) "%s %02x:%02x.%x PM transition D%d->D%d"
 pci_update_mappings_del(const char *dev, uint32_t bus, uint32_t slot, uint32_t func, int bar, uint64_t addr, uint64_t size) "%s %02x:%02x.%x %d,0x%"PRIx64"+0x%"PRIx64
 pci_update_mappings_add(const char *dev, uint32_t bus, uint32_t slot, uint32_t func, int bar, uint64_t addr, uint64_t size) "%s %02x:%02x.%x %d,0x%"PRIx64"+0x%"PRIx64
 pci_route_irq(int dev_irq, const char *dev_path, int parent_irq, const char *parent_path) "IRQ %d @%s -> IRQ %d @%s"
--
2.48.1

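To make the intended use of the new helper concrete before the conversions in
the next patch, here is a minimal sketch of a device model registering its PM
capability; the function name, the offset choice and the decision to let the
guest write the power state are illustrative assumptions, not part of the
patch:

    /* Sketch only: add a PM capability and opt in to guest-initiated
     * power state changes (pci_pm_init() deliberately leaves wmask
     * untouched for migration compatibility). */
    static bool example_dev_init_pm(PCIDevice *pdev, Error **errp)
    {
        int pos = pci_pm_init(pdev, 0, errp);  /* 0 lets PCI core pick a free offset */

        if (pos < 0) {
            return false;
        }

        pci_set_word(pdev->wmask + pos + PCI_PM_CTRL, PCI_PM_CTRL_STATE_MASK);
        return true;
    }
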
From: Alex Williamson <alex.williamson@redhat.com>

Switch callers directly initializing the PCI PM capability with
pci_add_capability() to use pci_pm_init().

Cc: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Cc: Akihiko Odaki <akihiko.odaki@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Stefan Weil <sw@weilnetz.de>
Cc: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Klaus Jensen <its@irrelevant.dk>
Cc: Jesper Devantier <foss@defmacro.it>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-3-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 hw/net/e1000e.c                 | 3 +--
 hw/net/eepro100.c               | 4 +---
 hw/net/igb.c                    | 3 +--
 hw/nvme/ctrl.c                  | 3 +--
 hw/pci-bridge/pcie_pci_bridge.c | 2 +-
 hw/vfio/pci.c                   | 7 ++++++-
 hw/virtio/virtio-pci.c          | 3 +-
 7 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -XXX,XX +XXX,XX @@ static int
 e1000e_add_pm_capability(PCIDevice *pdev, uint8_t offset, uint16_t pmc)
 {
     Error *local_err = NULL;
-    int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset,
-                                 PCI_PM_SIZEOF, &local_err);
+    int ret = pci_pm_init(pdev, offset, &local_err);

     if (local_err) {
         error_report_err(local_err);
diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/eepro100.c
+++ b/hw/net/eepro100.c
@@ -XXX,XX +XXX,XX @@ static void e100_pci_reset(EEPRO100State *s, Error **errp)
     if (info->power_management) {
         /* Power Management Capabilities */
         int cfg_offset = 0xdc;
-        int r = pci_add_capability(&s->dev, PCI_CAP_ID_PM,
-                                   cfg_offset, PCI_PM_SIZEOF,
-                                   errp);
+        int r = pci_pm_init(&s->dev, cfg_offset, errp);
         if (r < 0) {
             return;
         }
diff --git a/hw/net/igb.c b/hw/net/igb.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/net/igb.c
+++ b/hw/net/igb.c
@@ -XXX,XX +XXX,XX @@ static int
 igb_add_pm_capability(PCIDevice *pdev, uint8_t offset, uint16_t pmc)
 {
     Error *local_err = NULL;
-    int ret = pci_add_capability(pdev, PCI_CAP_ID_PM, offset,
-                                 PCI_PM_SIZEOF, &local_err);
+    int ret = pci_pm_init(pdev, offset, &local_err);

     if (local_err) {
         error_report_err(local_err);
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -XXX,XX +XXX,XX @@ static int nvme_add_pm_capability(PCIDevice *pci_dev, uint8_t offset)
     Error *err = NULL;
     int ret;

-    ret = pci_add_capability(pci_dev, PCI_CAP_ID_PM, offset,
-                             PCI_PM_SIZEOF, &err);
+    ret = pci_pm_init(pci_dev, offset, &err);
     if (err) {
         error_report_err(err);
         return ret;
diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/pci-bridge/pcie_pci_bridge.c
+++ b/hw/pci-bridge/pcie_pci_bridge.c
@@ -XXX,XX +XXX,XX @@ static void pcie_pci_bridge_realize(PCIDevice *d, Error **errp)
         goto cap_error;
     }

-    pos = pci_add_capability(d, PCI_CAP_ID_PM, 0, PCI_PM_SIZEOF, errp);
+    pos = pci_pm_init(d, 0, errp);
     if (pos < 0) {
         goto pm_error;
     }
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -XXX,XX +XXX,XX @@ static bool vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
     case PCI_CAP_ID_PM:
         vfio_check_pm_reset(vdev, pos);
         vdev->pm_cap = pos;
-        ret = pci_add_capability(pdev, cap_id, pos, size, errp) >= 0;
+        ret = pci_pm_init(pdev, pos, errp) >= 0;
+        /*
+         * PCI-core config space emulation needs write access to the power
+         * state enabled for tracking BAR mapping relative to PM state.
+         */
+        pci_set_word(pdev->wmask + pos + PCI_PM_CTRL, PCI_PM_CTRL_STATE_MASK);
         break;
     case PCI_CAP_ID_AF:
         vfio_check_af_flr(vdev, pos);
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -XXX,XX +XXX,XX @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error **errp)
     pos = pcie_endpoint_cap_init(pci_dev, 0);
     assert(pos > 0);

-    pos = pci_add_capability(pci_dev, PCI_CAP_ID_PM, 0,
-                             PCI_PM_SIZEOF, errp);
+    pos = pci_pm_init(pci_dev, 0, errp);
     if (pos < 0) {
         return;
     }
--
2.48.1

From: Alex Williamson <alex.williamson@redhat.com>

This is now redundant to PCIDevice.pm_cap.

Cc: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-4-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/pci.h | 1 -
 hw/vfio/pci.c | 9 ++++-----
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -XXX,XX +XXX,XX @@ struct VFIOPCIDevice {
     int32_t bootindex;
     uint32_t igd_gms;
     OffAutoPCIBAR msix_relo;
-    uint8_t pm_cap;
     uint8_t nv_gpudirect_clique;
     bool pci_aer;
     bool req_enabled;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -XXX,XX +XXX,XX @@ static bool vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
         break;
     case PCI_CAP_ID_PM:
         vfio_check_pm_reset(vdev, pos);
-        vdev->pm_cap = pos;
         ret = pci_pm_init(pdev, pos, errp) >= 0;
         /*
@@ -XXX,XX +XXX,XX @@ void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
     vfio_disable_interrupts(vdev);

     /* Make sure the device is in D0 */
-    if (vdev->pm_cap) {
+    if (pdev->pm_cap) {
         uint16_t pmcsr;
         uint8_t state;

-        pmcsr = vfio_pci_read_config(pdev, vdev->pm_cap + PCI_PM_CTRL, 2);
+        pmcsr = vfio_pci_read_config(pdev, pdev->pm_cap + PCI_PM_CTRL, 2);
         state = pmcsr & PCI_PM_CTRL_STATE_MASK;
         if (state) {
             pmcsr &= ~PCI_PM_CTRL_STATE_MASK;
-            vfio_pci_write_config(pdev, vdev->pm_cap + PCI_PM_CTRL, pmcsr, 2);
+            vfio_pci_write_config(pdev, pdev->pm_cap + PCI_PM_CTRL, pmcsr, 2);
             /* vfio handles the necessary delay here */
-            pmcsr = vfio_pci_read_config(pdev, vdev->pm_cap + PCI_PM_CTRL, 2);
+            pmcsr = vfio_pci_read_config(pdev, pdev->pm_cap + PCI_PM_CTRL, 2);
             state = pmcsr & PCI_PM_CTRL_STATE_MASK;
             if (state) {
                 error_report("vfio: Unable to power on device, stuck in D%d",
--
2.48.1

From: Alex Williamson <alex.williamson@redhat.com>

The pm_cap on the PCIExpressDevice object can be distilled down
to the new instance on the PCIDevice object.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-5-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 include/hw/pci/pcie.h           | 2 --
 hw/pci-bridge/pcie_pci_bridge.c | 1 -
 hw/virtio/virtio-pci.c          | 8 +++-----
 3 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index XXXXXXX..XXXXXXX 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -XXX,XX +XXX,XX @@ typedef enum {
 struct PCIExpressDevice {
     /* Offset of express capability in config space */
     uint8_t exp_cap;
-    /* Offset of Power Management capability in config space */
-    uint8_t pm_cap;

     /* SLOT */
     bool hpev_notified; /* Logical AND of conditions for hot plug event.
diff --git a/hw/pci-bridge/pcie_pci_bridge.c b/hw/pci-bridge/pcie_pci_bridge.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/pci-bridge/pcie_pci_bridge.c
+++ b/hw/pci-bridge/pcie_pci_bridge.c
@@ -XXX,XX +XXX,XX @@ static void pcie_pci_bridge_realize(PCIDevice *d, Error **errp)
     if (pos < 0) {
         goto pm_error;
     }
-    d->exp.pm_cap = pos;
     pci_set_word(d->config + pos + PCI_PM_PMC, 0x3);

     pcie_cap_arifwd_init(d);
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -XXX,XX +XXX,XX @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error **errp)
         return;
     }

-    pci_dev->exp.pm_cap = pos;
-
     /*
      * Indicates that this function complies with revision 1.2 of the
      * PCI Power Management Interface Specification.
@@ -XXX,XX +XXX,XX @@ static bool virtio_pci_no_soft_reset(PCIDevice *dev)
 {
     uint16_t pmcsr;

-    if (!pci_is_express(dev) || !dev->exp.pm_cap) {
+    if (!pci_is_express(dev) || !(dev->cap_present & QEMU_PCI_CAP_PM)) {
         return false;
     }

-    pmcsr = pci_get_word(dev->config + dev->exp.pm_cap + PCI_PM_CTRL);
+    pmcsr = pci_get_word(dev->config + dev->pm_cap + PCI_PM_CTRL);

     /*
      * When No_Soft_Reset bit is set and the device
@@ -XXX,XX +XXX,XX @@ static void virtio_pci_bus_reset_hold(Object *obj, ResetType type)

     if (proxy->flags & VIRTIO_PCI_FLAG_INIT_PM) {
         pci_word_test_and_clear_mask(
-            dev->config + dev->exp.pm_cap + PCI_PM_CTRL,
+            dev->config + dev->pm_cap + PCI_PM_CTRL,
             PCI_PM_CTRL_STATE_MASK);
     }
 }
--
2.48.1

From: Alex Williamson <alex.williamson@redhat.com>

We want the device in the D0 power state going into reset, but the
config write can enable the BARs in the address space, which are
then removed from the address space once we clear the memory enable
bit in the command register. Re-order to clear the command bit
first, so the power state change doesn't enable the BARs.

Cc: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250225215237.3314011-6-alex.williamson@redhat.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 hw/vfio/pci.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -XXX,XX +XXX,XX @@ void vfio_pci_pre_reset(VFIOPCIDevice *vdev)

     vfio_disable_interrupts(vdev);

+    /*
+     * Stop any ongoing DMA by disconnecting I/O, MMIO, and bus master.
+     * Also put INTx Disable in known state.
+     */
+    cmd = vfio_pci_read_config(pdev, PCI_COMMAND, 2);
+    cmd &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
+             PCI_COMMAND_INTX_DISABLE);
+    vfio_pci_write_config(pdev, PCI_COMMAND, cmd, 2);
+
     /* Make sure the device is in D0 */
     if (pdev->pm_cap) {
         uint16_t pmcsr;
@@ -XXX,XX +XXX,XX @@ void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
             }
         }
     }
-
-    /*
-     * Stop any ongoing DMA by disconnecting I/O, MMIO, and bus master.
-     * Also put INTx Disable in known state.
-     */
-    cmd = vfio_pci_read_config(pdev, PCI_COMMAND, 2);
-    cmd &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
-             PCI_COMMAND_INTX_DISABLE);
-    vfio_pci_write_config(pdev, PCI_COMMAND, cmd, 2);
 }

 void vfio_pci_post_reset(VFIOPCIDevice *vdev)
--
2.48.1

From: Tomita Moeko <tomitamoeko@gmail.com>

As suggested by Cédric, I'm glad to be a maintainer of vfio-igd.

Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250227162741.9860-1-tomitamoeko@gmail.com
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 MAINTAINERS | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index XXXXXXX..XXXXXXX 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -XXX,XX +XXX,XX @@ M: Cédric Le Goater <clg@redhat.com>
 S: Supported
 F: hw/vfio/*
 F: include/hw/vfio/
-F: docs/igd-assign.txt
 F: docs/devel/migration/vfio.rst
 F: qapi/vfio.json

+vfio-igd
+M: Alex Williamson <alex.williamson@redhat.com>
+M: Cédric Le Goater <clg@redhat.com>
+M: Tomita Moeko <tomitamoeko@gmail.com>
+S: Supported
+F: hw/vfio/igd.c
+F: docs/igd-assign.txt
+
 vfio-ccw
 M: Eric Farman <farman@linux.ibm.com>
 M: Matthew Rosato <mjrosato@linux.ibm.com>
--
2.48.1

From: Eric Auger <eric.auger@redhat.com>

As an outcome of KVM forum 2024 "vfio-platform: live and let die?"
talk, let's deprecate vfio-platform devices.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/qemu-devel/20250305124225.952791-1-eric.auger@redhat.com
[ clg: Fixed spelling in vfio-amd-xgbe section ]
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
 docs/about/deprecated.rst | 25 +++++++++++++++++++++++++
 hw/vfio/amd-xgbe.c        |  2 ++
 hw/vfio/calxeda-xgmac.c   |  2 ++
 hw/vfio/platform.c        |  1 +
 4 files changed, 30 insertions(+)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index XXXXXXX..XXXXXXX 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -XXX,XX +XXX,XX @@ Stream ``reconnect`` (since 9.2)
 The ``reconnect`` option only allows specifiying second granularity timeouts,
 which is not enough for all types of use cases, use ``reconnect-ms`` instead.

+VFIO device options
+'''''''''''''''''''
+
+``-device vfio-calxeda-xgmac`` (since 10.0)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The vfio-calxeda-xgmac device allows to assign a host Calxeda Highbank
+10Gb XGMAC Ethernet controller device ("calxeda,hb-xgmac" compatibility
+string) to a guest. Calxeda HW has been e-wasted now and there is no point
+keeping that device.
+
+``-device vfio-amd-xgbe`` (since 10.0)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The vfio-amd-xgbe device allows to assign a host AMD 10GbE controller
+to a guest ("amd,xgbe-seattle-v1a" compatibility string). AMD "Seattle"
+is not supported anymore and there is no point keeping that device.
+
+``-device vfio-platform`` (since 10.0)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The vfio-platform device allows to assign a host platform device
+to a guest in a generic manner. Integrating a new device into
+the vfio-platform infrastructure requires some adaptation at
+both kernel and qemu level. No such attempt has been done for years
+and the conclusion is that vfio-platform has not got any traction.
+PCIe passthrough shall be the mainline solution.
+
 CPU device properties
 '''''''''''''''''''''

diff --git a/hw/vfio/amd-xgbe.c b/hw/vfio/amd-xgbe.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/amd-xgbe.c
+++ b/hw/vfio/amd-xgbe.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/vfio/vfio-amd-xgbe.h"
 #include "migration/vmstate.h"
 #include "qemu/module.h"
+#include "qemu/error-report.h"

 static void amd_xgbe_realize(DeviceState *dev, Error **errp)
 {
     VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
     VFIOAmdXgbeDeviceClass *k = VFIO_AMD_XGBE_DEVICE_GET_CLASS(dev);

+    warn_report("-device vfio-amd-xgbe is deprecated");
     vdev->compat = g_strdup("amd,xgbe-seattle-v1a");
     vdev->num_compat = 1;

diff --git a/hw/vfio/calxeda-xgmac.c b/hw/vfio/calxeda-xgmac.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/calxeda-xgmac.c
+++ b/hw/vfio/calxeda-xgmac.c
@@ -XXX,XX +XXX,XX @@
 #include "hw/vfio/vfio-calxeda-xgmac.h"
 #include "migration/vmstate.h"
 #include "qemu/module.h"
+#include "qemu/error-report.h"

 static void calxeda_xgmac_realize(DeviceState *dev, Error **errp)
 {
     VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
     VFIOCalxedaXgmacDeviceClass *k = VFIO_CALXEDA_XGMAC_DEVICE_GET_CLASS(dev);

+    warn_report("-device vfio-calxeda-xgmac is deprecated");
     vdev->compat = g_strdup("calxeda,hb-xgmac");
     vdev->num_compat = 1;

diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
index XXXXXXX..XXXXXXX 100644
--- a/hw/vfio/platform.c
+++ b/hw/vfio/platform.c
@@ -XXX,XX +XXX,XX @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
     VFIODevice *vbasedev = &vdev->vbasedev;
     int i;

+    warn_report("-device vfio-platform is deprecated");
     qemu_mutex_init(&vdev->intp_mutex);

     trace_vfio_platform_realize(vbasedev->sysfsdev ?
--
2.48.1

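For reference, with this change anyone still using the deprecated devices gets
a startup warning roughly like the line below; the binary name is chosen for
illustration and the exact prefix comes from warn_report():

  qemu-system-aarch64: warning: -device vfio-platform is deprecated
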
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | It's possible for {load,save}_cleanup SaveVMHandlers to get called without | ||
4 | the corresponding {load,save}_setup handler being called first. | ||
5 | |||
6 | One such example is if the {load,save}_setup handler of a preceding device | ||
7 | returns an error. | ||
8 | In this case the migration core cleanup code will call all corresponding | ||
9 | cleanup handlers, even for those devices whose setup handler hasn't been | ||
10 | called. | ||
11 | |||
12 | Since this behavior can be surprising, let's clearly document it | ||
13 | in the SaveVMHandlers description. | ||
14 | |||
15 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
16 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
17 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
18 | Link: https://lore.kernel.org/qemu-devel/991636623fb780350f493b5f045cb17e13ce4c0f.1741124640.git.maciej.szmigiero@oracle.com | ||
19 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
20 | --- | ||
21 | include/migration/register.h | 6 +++++- | ||
22 | 1 file changed, 5 insertions(+), 1 deletion(-) | ||
23 | |||
24 | diff --git a/include/migration/register.h b/include/migration/register.h | ||
25 | index XXXXXXX..XXXXXXX 100644 | ||
26 | --- a/include/migration/register.h | ||
27 | +++ b/include/migration/register.h | ||
28 | @@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers { | ||
29 | /** | ||
30 | * @save_cleanup | ||
31 | * | ||
32 | - * Uninitializes the data structures on the source | ||
33 | + * Uninitializes the data structures on the source. | ||
34 | + * Note that this handler can be called even if save_setup | ||
35 | + * wasn't called earlier. | ||
36 | * | ||
37 | * @opaque: data pointer passed to register_savevm_live() | ||
38 | */ | ||
39 | @@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers { | ||
40 | * @load_cleanup | ||
41 | * | ||
42 | * Uninitializes the data structures on the destination. | ||
43 | + * Note that this handler can be called even if load_setup | ||
44 | + * wasn't called earlier. | ||
45 | * | ||
46 | * @opaque: data pointer passed to register_savevm_live() | ||
47 | * | ||
48 | -- | ||
49 | 2.48.1 | ||
50 | |||
51 | diff view generated by jsdifflib |
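As an illustration of the contract documented above, here is a minimal sketch (not part of this series; DemoDevState and its fields are made up) of a load_cleanup handler written so that it stays safe even when the corresponding load_setup never ran:

    typedef struct DemoDevState {
        bool load_setup_done;   /* set to true at the end of load_setup */
        void *load_buf;         /* allocated by load_setup */
    } DemoDevState;

    static int demo_load_cleanup(void *opaque)
    {
        DemoDevState *s = opaque;

        /*
         * Per the documentation added above, this can run even if the
         * device's load_setup was never called (e.g. an earlier device's
         * setup handler failed), so only undo what was actually done.
         */
        if (!s->load_setup_done) {
            return 0;
        }

        g_free(s->load_buf);
        s->load_buf = NULL;
        s->load_setup_done = false;
        return 0;
    }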
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | This function name conflicts with one used by a future generic thread pool | ||
4 | function and it was only used by one test anyway. | ||
5 | |||
6 | Update the trace event name in thread_pool_submit_aio() accordingly. | ||
7 | |||
8 | Acked-by: Fabiano Rosas <farosas@suse.de> | ||
9 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
10 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
11 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
12 | Link: https://lore.kernel.org/qemu-devel/6830f07777f939edaf0a2d301c39adcaaf3817f0.1741124640.git.maciej.szmigiero@oracle.com | ||
13 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
14 | --- | ||
15 | include/block/thread-pool.h | 3 +-- | ||
16 | tests/unit/test-thread-pool.c | 6 +++--- | ||
17 | util/thread-pool.c | 7 +------ | ||
18 | util/trace-events | 2 +- | ||
19 | 4 files changed, 6 insertions(+), 12 deletions(-) | ||
20 | |||
21 | diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h | ||
22 | index XXXXXXX..XXXXXXX 100644 | ||
23 | --- a/include/block/thread-pool.h | ||
24 | +++ b/include/block/thread-pool.h | ||
25 | @@ -XXX,XX +XXX,XX @@ ThreadPool *thread_pool_new(struct AioContext *ctx); | ||
26 | void thread_pool_free(ThreadPool *pool); | ||
27 | |||
28 | /* | ||
29 | - * thread_pool_submit* API: submit I/O requests in the thread's | ||
30 | + * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's | ||
31 | * current AioContext. | ||
32 | */ | ||
33 | BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, | ||
34 | BlockCompletionFunc *cb, void *opaque); | ||
35 | int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); | ||
36 | -void thread_pool_submit(ThreadPoolFunc *func, void *arg); | ||
37 | |||
38 | void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx); | ||
39 | |||
40 | diff --git a/tests/unit/test-thread-pool.c b/tests/unit/test-thread-pool.c | ||
41 | index XXXXXXX..XXXXXXX 100644 | ||
42 | --- a/tests/unit/test-thread-pool.c | ||
43 | +++ b/tests/unit/test-thread-pool.c | ||
44 | @@ -XXX,XX +XXX,XX @@ static void done_cb(void *opaque, int ret) | ||
45 | active--; | ||
46 | } | ||
47 | |||
48 | -static void test_submit(void) | ||
49 | +static void test_submit_no_complete(void) | ||
50 | { | ||
51 | WorkerTestData data = { .n = 0 }; | ||
52 | - thread_pool_submit(worker_cb, &data); | ||
53 | + thread_pool_submit_aio(worker_cb, &data, NULL, NULL); | ||
54 | while (data.n == 0) { | ||
55 | aio_poll(ctx, true); | ||
56 | } | ||
57 | @@ -XXX,XX +XXX,XX @@ int main(int argc, char **argv) | ||
58 | ctx = qemu_get_current_aio_context(); | ||
59 | |||
60 | g_test_init(&argc, &argv, NULL); | ||
61 | - g_test_add_func("/thread-pool/submit", test_submit); | ||
62 | + g_test_add_func("/thread-pool/submit-no-complete", test_submit_no_complete); | ||
63 | g_test_add_func("/thread-pool/submit-aio", test_submit_aio); | ||
64 | g_test_add_func("/thread-pool/submit-co", test_submit_co); | ||
65 | g_test_add_func("/thread-pool/submit-many", test_submit_many); | ||
66 | diff --git a/util/thread-pool.c b/util/thread-pool.c | ||
67 | index XXXXXXX..XXXXXXX 100644 | ||
68 | --- a/util/thread-pool.c | ||
69 | +++ b/util/thread-pool.c | ||
70 | @@ -XXX,XX +XXX,XX @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, | ||
71 | |||
72 | QLIST_INSERT_HEAD(&pool->head, req, all); | ||
73 | |||
74 | - trace_thread_pool_submit(pool, req, arg); | ||
75 | + trace_thread_pool_submit_aio(pool, req, arg); | ||
76 | |||
77 | qemu_mutex_lock(&pool->lock); | ||
78 | if (pool->idle_threads == 0 && pool->cur_threads < pool->max_threads) { | ||
79 | @@ -XXX,XX +XXX,XX @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) | ||
80 | return tpc.ret; | ||
81 | } | ||
82 | |||
83 | -void thread_pool_submit(ThreadPoolFunc *func, void *arg) | ||
84 | -{ | ||
85 | - thread_pool_submit_aio(func, arg, NULL, NULL); | ||
86 | -} | ||
87 | - | ||
88 | void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) | ||
89 | { | ||
90 | qemu_mutex_lock(&pool->lock); | ||
91 | diff --git a/util/trace-events b/util/trace-events | ||
92 | index XXXXXXX..XXXXXXX 100644 | ||
93 | --- a/util/trace-events | ||
94 | +++ b/util/trace-events | ||
95 | @@ -XXX,XX +XXX,XX @@ aio_co_schedule_bh_cb(void *ctx, void *co) "ctx %p co %p" | ||
96 | reentrant_aio(void *ctx, const char *name) "ctx %p name %s" | ||
97 | |||
98 | # thread-pool.c | ||
99 | -thread_pool_submit(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" | ||
100 | +thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" | ||
101 | thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" | ||
102 | thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" | ||
103 | |||
104 | -- | ||
105 | 2.48.1 | ||
106 | |||
107 | diff view generated by jsdifflib |
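For any remaining caller of the removed wrapper, the equivalent call is the one the wrapper itself used to make, as the test update above shows (sketch only, reusing the test's worker_cb and data):

    /* before */
    thread_pool_submit(worker_cb, &data);

    /* after: same behaviour, no completion callback */
    thread_pool_submit_aio(worker_cb, &data, NULL, NULL);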
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | These names conflict with ones used by future generic thread pool | ||
4 | equivalents. | ||
5 | Generic names should belong to the generic pool type, not to the specific | ||
6 | (AIO) type. | ||
7 | |||
8 | Acked-by: Fabiano Rosas <farosas@suse.de> | ||
9 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
10 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
11 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
12 | Link: https://lore.kernel.org/qemu-devel/70f9e0fb4b01042258a1a57996c64d19779dc7f0.1741124640.git.maciej.szmigiero@oracle.com | ||
13 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
14 | --- | ||
15 | include/block/aio.h | 8 ++--- | ||
16 | include/block/thread-pool.h | 8 ++--- | ||
17 | util/async.c | 6 ++-- | ||
18 | util/thread-pool.c | 58 ++++++++++++++++++------------------- | ||
19 | util/trace-events | 4 +-- | ||
20 | 5 files changed, 42 insertions(+), 42 deletions(-) | ||
21 | |||
22 | diff --git a/include/block/aio.h b/include/block/aio.h | ||
23 | index XXXXXXX..XXXXXXX 100644 | ||
24 | --- a/include/block/aio.h | ||
25 | +++ b/include/block/aio.h | ||
26 | @@ -XXX,XX +XXX,XX @@ typedef void QEMUBHFunc(void *opaque); | ||
27 | typedef bool AioPollFn(void *opaque); | ||
28 | typedef void IOHandler(void *opaque); | ||
29 | |||
30 | -struct ThreadPool; | ||
31 | +struct ThreadPoolAio; | ||
32 | struct LinuxAioState; | ||
33 | typedef struct LuringState LuringState; | ||
34 | |||
35 | @@ -XXX,XX +XXX,XX @@ struct AioContext { | ||
36 | /* Thread pool for performing work and receiving completion callbacks. | ||
37 | * Has its own locking. | ||
38 | */ | ||
39 | - struct ThreadPool *thread_pool; | ||
40 | + struct ThreadPoolAio *thread_pool; | ||
41 | |||
42 | #ifdef CONFIG_LINUX_AIO | ||
43 | struct LinuxAioState *linux_aio; | ||
44 | @@ -XXX,XX +XXX,XX @@ void aio_set_event_notifier_poll(AioContext *ctx, | ||
45 | */ | ||
46 | GSource *aio_get_g_source(AioContext *ctx); | ||
47 | |||
48 | -/* Return the ThreadPool bound to this AioContext */ | ||
49 | -struct ThreadPool *aio_get_thread_pool(AioContext *ctx); | ||
50 | +/* Return the ThreadPoolAio bound to this AioContext */ | ||
51 | +struct ThreadPoolAio *aio_get_thread_pool(AioContext *ctx); | ||
52 | |||
53 | /* Setup the LinuxAioState bound to this AioContext */ | ||
54 | struct LinuxAioState *aio_setup_linux_aio(AioContext *ctx, Error **errp); | ||
55 | diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h | ||
56 | index XXXXXXX..XXXXXXX 100644 | ||
57 | --- a/include/block/thread-pool.h | ||
58 | +++ b/include/block/thread-pool.h | ||
59 | @@ -XXX,XX +XXX,XX @@ | ||
60 | |||
61 | typedef int ThreadPoolFunc(void *opaque); | ||
62 | |||
63 | -typedef struct ThreadPool ThreadPool; | ||
64 | +typedef struct ThreadPoolAio ThreadPoolAio; | ||
65 | |||
66 | -ThreadPool *thread_pool_new(struct AioContext *ctx); | ||
67 | -void thread_pool_free(ThreadPool *pool); | ||
68 | +ThreadPoolAio *thread_pool_new_aio(struct AioContext *ctx); | ||
69 | +void thread_pool_free_aio(ThreadPoolAio *pool); | ||
70 | |||
71 | /* | ||
72 | * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's | ||
73 | @@ -XXX,XX +XXX,XX @@ void thread_pool_free(ThreadPool *pool); | ||
74 | BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, | ||
75 | BlockCompletionFunc *cb, void *opaque); | ||
76 | int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); | ||
77 | +void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx); | ||
78 | |||
79 | -void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx); | ||
80 | |||
81 | #endif | ||
82 | diff --git a/util/async.c b/util/async.c | ||
83 | index XXXXXXX..XXXXXXX 100644 | ||
84 | --- a/util/async.c | ||
85 | +++ b/util/async.c | ||
86 | @@ -XXX,XX +XXX,XX @@ aio_ctx_finalize(GSource *source) | ||
87 | QEMUBH *bh; | ||
88 | unsigned flags; | ||
89 | |||
90 | - thread_pool_free(ctx->thread_pool); | ||
91 | + thread_pool_free_aio(ctx->thread_pool); | ||
92 | |||
93 | #ifdef CONFIG_LINUX_AIO | ||
94 | if (ctx->linux_aio) { | ||
95 | @@ -XXX,XX +XXX,XX @@ GSource *aio_get_g_source(AioContext *ctx) | ||
96 | return &ctx->source; | ||
97 | } | ||
98 | |||
99 | -ThreadPool *aio_get_thread_pool(AioContext *ctx) | ||
100 | +ThreadPoolAio *aio_get_thread_pool(AioContext *ctx) | ||
101 | { | ||
102 | if (!ctx->thread_pool) { | ||
103 | - ctx->thread_pool = thread_pool_new(ctx); | ||
104 | + ctx->thread_pool = thread_pool_new_aio(ctx); | ||
105 | } | ||
106 | return ctx->thread_pool; | ||
107 | } | ||
108 | diff --git a/util/thread-pool.c b/util/thread-pool.c | ||
109 | index XXXXXXX..XXXXXXX 100644 | ||
110 | --- a/util/thread-pool.c | ||
111 | +++ b/util/thread-pool.c | ||
112 | @@ -XXX,XX +XXX,XX @@ | ||
113 | #include "block/thread-pool.h" | ||
114 | #include "qemu/main-loop.h" | ||
115 | |||
116 | -static void do_spawn_thread(ThreadPool *pool); | ||
117 | +static void do_spawn_thread(ThreadPoolAio *pool); | ||
118 | |||
119 | -typedef struct ThreadPoolElement ThreadPoolElement; | ||
120 | +typedef struct ThreadPoolElementAio ThreadPoolElementAio; | ||
121 | |||
122 | enum ThreadState { | ||
123 | THREAD_QUEUED, | ||
124 | @@ -XXX,XX +XXX,XX @@ enum ThreadState { | ||
125 | THREAD_DONE, | ||
126 | }; | ||
127 | |||
128 | -struct ThreadPoolElement { | ||
129 | +struct ThreadPoolElementAio { | ||
130 | BlockAIOCB common; | ||
131 | - ThreadPool *pool; | ||
132 | + ThreadPoolAio *pool; | ||
133 | ThreadPoolFunc *func; | ||
134 | void *arg; | ||
135 | |||
136 | @@ -XXX,XX +XXX,XX @@ struct ThreadPoolElement { | ||
137 | int ret; | ||
138 | |||
139 | /* Access to this list is protected by lock. */ | ||
140 | - QTAILQ_ENTRY(ThreadPoolElement) reqs; | ||
141 | + QTAILQ_ENTRY(ThreadPoolElementAio) reqs; | ||
142 | |||
143 | /* This list is only written by the thread pool's mother thread. */ | ||
144 | - QLIST_ENTRY(ThreadPoolElement) all; | ||
145 | + QLIST_ENTRY(ThreadPoolElementAio) all; | ||
146 | }; | ||
147 | |||
148 | -struct ThreadPool { | ||
149 | +struct ThreadPoolAio { | ||
150 | AioContext *ctx; | ||
151 | QEMUBH *completion_bh; | ||
152 | QemuMutex lock; | ||
153 | @@ -XXX,XX +XXX,XX @@ struct ThreadPool { | ||
154 | QEMUBH *new_thread_bh; | ||
155 | |||
156 | /* The following variables are only accessed from one AioContext. */ | ||
157 | - QLIST_HEAD(, ThreadPoolElement) head; | ||
158 | + QLIST_HEAD(, ThreadPoolElementAio) head; | ||
159 | |||
160 | /* The following variables are protected by lock. */ | ||
161 | - QTAILQ_HEAD(, ThreadPoolElement) request_list; | ||
162 | + QTAILQ_HEAD(, ThreadPoolElementAio) request_list; | ||
163 | int cur_threads; | ||
164 | int idle_threads; | ||
165 | int new_threads; /* backlog of threads we need to create */ | ||
166 | @@ -XXX,XX +XXX,XX @@ struct ThreadPool { | ||
167 | |||
168 | static void *worker_thread(void *opaque) | ||
169 | { | ||
170 | - ThreadPool *pool = opaque; | ||
171 | + ThreadPoolAio *pool = opaque; | ||
172 | |||
173 | qemu_mutex_lock(&pool->lock); | ||
174 | pool->pending_threads--; | ||
175 | do_spawn_thread(pool); | ||
176 | |||
177 | while (pool->cur_threads <= pool->max_threads) { | ||
178 | - ThreadPoolElement *req; | ||
179 | + ThreadPoolElementAio *req; | ||
180 | int ret; | ||
181 | |||
182 | if (QTAILQ_EMPTY(&pool->request_list)) { | ||
183 | @@ -XXX,XX +XXX,XX @@ static void *worker_thread(void *opaque) | ||
184 | return NULL; | ||
185 | } | ||
186 | |||
187 | -static void do_spawn_thread(ThreadPool *pool) | ||
188 | +static void do_spawn_thread(ThreadPoolAio *pool) | ||
189 | { | ||
190 | QemuThread t; | ||
191 | |||
192 | @@ -XXX,XX +XXX,XX @@ static void do_spawn_thread(ThreadPool *pool) | ||
193 | |||
194 | static void spawn_thread_bh_fn(void *opaque) | ||
195 | { | ||
196 | - ThreadPool *pool = opaque; | ||
197 | + ThreadPoolAio *pool = opaque; | ||
198 | |||
199 | qemu_mutex_lock(&pool->lock); | ||
200 | do_spawn_thread(pool); | ||
201 | qemu_mutex_unlock(&pool->lock); | ||
202 | } | ||
203 | |||
204 | -static void spawn_thread(ThreadPool *pool) | ||
205 | +static void spawn_thread(ThreadPoolAio *pool) | ||
206 | { | ||
207 | pool->cur_threads++; | ||
208 | pool->new_threads++; | ||
209 | @@ -XXX,XX +XXX,XX @@ static void spawn_thread(ThreadPool *pool) | ||
210 | |||
211 | static void thread_pool_completion_bh(void *opaque) | ||
212 | { | ||
213 | - ThreadPool *pool = opaque; | ||
214 | - ThreadPoolElement *elem, *next; | ||
215 | + ThreadPoolAio *pool = opaque; | ||
216 | + ThreadPoolElementAio *elem, *next; | ||
217 | |||
218 | defer_call_begin(); /* cb() may use defer_call() to coalesce work */ | ||
219 | |||
220 | @@ -XXX,XX +XXX,XX @@ restart: | ||
221 | continue; | ||
222 | } | ||
223 | |||
224 | - trace_thread_pool_complete(pool, elem, elem->common.opaque, | ||
225 | - elem->ret); | ||
226 | + trace_thread_pool_complete_aio(pool, elem, elem->common.opaque, | ||
227 | + elem->ret); | ||
228 | QLIST_REMOVE(elem, all); | ||
229 | |||
230 | if (elem->common.cb) { | ||
231 | @@ -XXX,XX +XXX,XX @@ restart: | ||
232 | |||
233 | static void thread_pool_cancel(BlockAIOCB *acb) | ||
234 | { | ||
235 | - ThreadPoolElement *elem = (ThreadPoolElement *)acb; | ||
236 | - ThreadPool *pool = elem->pool; | ||
237 | + ThreadPoolElementAio *elem = (ThreadPoolElementAio *)acb; | ||
238 | + ThreadPoolAio *pool = elem->pool; | ||
239 | |||
240 | - trace_thread_pool_cancel(elem, elem->common.opaque); | ||
241 | + trace_thread_pool_cancel_aio(elem, elem->common.opaque); | ||
242 | |||
243 | QEMU_LOCK_GUARD(&pool->lock); | ||
244 | if (elem->state == THREAD_QUEUED) { | ||
245 | @@ -XXX,XX +XXX,XX @@ static void thread_pool_cancel(BlockAIOCB *acb) | ||
246 | } | ||
247 | |||
248 | static const AIOCBInfo thread_pool_aiocb_info = { | ||
249 | - .aiocb_size = sizeof(ThreadPoolElement), | ||
250 | + .aiocb_size = sizeof(ThreadPoolElementAio), | ||
251 | .cancel_async = thread_pool_cancel, | ||
252 | }; | ||
253 | |||
254 | BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, | ||
255 | BlockCompletionFunc *cb, void *opaque) | ||
256 | { | ||
257 | - ThreadPoolElement *req; | ||
258 | + ThreadPoolElementAio *req; | ||
259 | AioContext *ctx = qemu_get_current_aio_context(); | ||
260 | - ThreadPool *pool = aio_get_thread_pool(ctx); | ||
261 | + ThreadPoolAio *pool = aio_get_thread_pool(ctx); | ||
262 | |||
263 | /* Assert that the thread submitting work is the same running the pool */ | ||
264 | assert(pool->ctx == qemu_get_current_aio_context()); | ||
265 | @@ -XXX,XX +XXX,XX @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) | ||
266 | return tpc.ret; | ||
267 | } | ||
268 | |||
269 | -void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) | ||
270 | +void thread_pool_update_params(ThreadPoolAio *pool, AioContext *ctx) | ||
271 | { | ||
272 | qemu_mutex_lock(&pool->lock); | ||
273 | |||
274 | @@ -XXX,XX +XXX,XX @@ void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) | ||
275 | qemu_mutex_unlock(&pool->lock); | ||
276 | } | ||
277 | |||
278 | -static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx) | ||
279 | +static void thread_pool_init_one(ThreadPoolAio *pool, AioContext *ctx) | ||
280 | { | ||
281 | if (!ctx) { | ||
282 | ctx = qemu_get_aio_context(); | ||
283 | @@ -XXX,XX +XXX,XX @@ static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx) | ||
284 | thread_pool_update_params(pool, ctx); | ||
285 | } | ||
286 | |||
287 | -ThreadPool *thread_pool_new(AioContext *ctx) | ||
288 | +ThreadPoolAio *thread_pool_new_aio(AioContext *ctx) | ||
289 | { | ||
290 | - ThreadPool *pool = g_new(ThreadPool, 1); | ||
291 | + ThreadPoolAio *pool = g_new(ThreadPoolAio, 1); | ||
292 | thread_pool_init_one(pool, ctx); | ||
293 | return pool; | ||
294 | } | ||
295 | |||
296 | -void thread_pool_free(ThreadPool *pool) | ||
297 | +void thread_pool_free_aio(ThreadPoolAio *pool) | ||
298 | { | ||
299 | if (!pool) { | ||
300 | return; | ||
301 | diff --git a/util/trace-events b/util/trace-events | ||
302 | index XXXXXXX..XXXXXXX 100644 | ||
303 | --- a/util/trace-events | ||
304 | +++ b/util/trace-events | ||
305 | @@ -XXX,XX +XXX,XX @@ reentrant_aio(void *ctx, const char *name) "ctx %p name %s" | ||
306 | |||
307 | # thread-pool.c | ||
308 | thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" | ||
309 | -thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" | ||
310 | -thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" | ||
311 | +thread_pool_complete_aio(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" | ||
312 | +thread_pool_cancel_aio(void *req, void *opaque) "req %p opaque %p" | ||
313 | |||
314 | # buffer.c | ||
315 | buffer_resize(const char *buf, size_t olen, size_t len) "%s: old %zd, new %zd" | ||
316 | -- | ||
317 | 2.48.1 | ||
318 | |||
319 | diff view generated by jsdifflib |
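For context, a minimal sketch of how the (otherwise unchanged) AIO pool API looks to a caller after the rename; demo_blocking_io() and demo_done() are invented for the example, and the submission must happen from within an AioContext (e.g. the main loop), as the assertion in thread_pool_submit_aio() requires:

    static int demo_blocking_io(void *opaque)
    {
        /* runs in a worker thread; may block */
        return 0;
    }

    static void demo_done(void *opaque, int ret)
    {
        /* completion callback, runs back in the submitting AioContext */
    }

    static void demo_submit(void)
    {
        thread_pool_submit_aio(demo_blocking_io, NULL, demo_done, NULL);
    }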
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Migration code wants to manage device data sending threads in one place. | ||
4 | |||
5 | QEMU has an existing thread pool implementation; however, it is limited | ||
6 | to queuing AIO operations only and essentially has a 1:1 mapping between | ||
7 | the current AioContext and the AIO ThreadPool in use. | ||
8 | |||
9 | Implement generic (non-AIO) ThreadPool by essentially wrapping Glib's | ||
10 | GThreadPool. | ||
11 | |||
12 | This brings a few new operations on a pool: | ||
13 | * thread_pool_wait() operation waits until all the submitted work requests | ||
14 | have finished. | ||
15 | |||
16 | * thread_pool_set_max_threads() explicitly sets the maximum thread count | ||
17 | in the pool. | ||
18 | |||
19 | * thread_pool_adjust_max_threads_to_work() adjusts the maximum thread count | ||
20 | in the pool to equal the number of work items still queued or unfinished. | ||
21 | |||
22 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
23 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
24 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
25 | Link: https://lore.kernel.org/qemu-devel/b1efaebdbea7cb7068b8fb74148777012383e12b.1741124640.git.maciej.szmigiero@oracle.com | ||
26 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
27 | --- | ||
28 | include/block/thread-pool.h | 51 ++++++++++++++++ | ||
29 | util/thread-pool.c | 119 ++++++++++++++++++++++++++++++++++++ | ||
30 | 2 files changed, 170 insertions(+) | ||
31 | |||
32 | diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h | ||
33 | index XXXXXXX..XXXXXXX 100644 | ||
34 | --- a/include/block/thread-pool.h | ||
35 | +++ b/include/block/thread-pool.h | ||
36 | @@ -XXX,XX +XXX,XX @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, | ||
37 | int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); | ||
38 | void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx); | ||
39 | |||
40 | +/* ------------------------------------------- */ | ||
41 | +/* Generic thread pool types and methods below */ | ||
42 | +typedef struct ThreadPool ThreadPool; | ||
43 | + | ||
44 | +/* Create a new thread pool. Never returns NULL. */ | ||
45 | +ThreadPool *thread_pool_new(void); | ||
46 | + | ||
47 | +/* | ||
48 | + * Free the thread pool. | ||
49 | + * Waits for all the previously submitted work to complete before performing | ||
50 | + * the actual freeing operation. | ||
51 | + */ | ||
52 | +void thread_pool_free(ThreadPool *pool); | ||
53 | + | ||
54 | +/* | ||
55 | + * Submit a new work item (task) to the pool. | ||
56 | + * | ||
57 | + * @opaque_destroy is an optional GDestroyNotify for the @opaque argument | ||
58 | + * to the work function at @func. | ||
59 | + */ | ||
60 | +void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, | ||
61 | + void *opaque, GDestroyNotify opaque_destroy); | ||
62 | + | ||
63 | +/* | ||
64 | + * Submit a new work item (task) to the pool, making sure it starts getting | ||
65 | + * processed immediately, launching a new thread for it if necessary. | ||
66 | + * | ||
67 | + * @opaque_destroy is an optional GDestroyNotify for the @opaque argument | ||
68 | + * to the work function at @func. | ||
69 | + */ | ||
70 | +void thread_pool_submit_immediate(ThreadPool *pool, ThreadPoolFunc *func, | ||
71 | + void *opaque, GDestroyNotify opaque_destroy); | ||
72 | + | ||
73 | +/* | ||
74 | + * Wait for all previously submitted work to complete before returning. | ||
75 | + * | ||
76 | + * Can be used as a barrier between two sets of tasks executed on a thread | ||
77 | + * pool without destroying it, or in a performance-sensitive path where the | ||
78 | + * caller just wants to wait for all tasks to complete while deferring the | ||
79 | + * pool free operation to a later, less performance-sensitive time. | ||
80 | + */ | ||
81 | +void thread_pool_wait(ThreadPool *pool); | ||
82 | + | ||
83 | +/* Set the maximum number of threads in the pool. */ | ||
84 | +bool thread_pool_set_max_threads(ThreadPool *pool, int max_threads); | ||
85 | + | ||
86 | +/* | ||
87 | + * Adjust the maximum number of threads in the pool to give each task its | ||
88 | + * own thread (exactly one thread per task). | ||
89 | + */ | ||
90 | +bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool); | ||
91 | |||
92 | #endif | ||
93 | diff --git a/util/thread-pool.c b/util/thread-pool.c | ||
94 | index XXXXXXX..XXXXXXX 100644 | ||
95 | --- a/util/thread-pool.c | ||
96 | +++ b/util/thread-pool.c | ||
97 | @@ -XXX,XX +XXX,XX @@ void thread_pool_free_aio(ThreadPoolAio *pool) | ||
98 | qemu_mutex_destroy(&pool->lock); | ||
99 | g_free(pool); | ||
100 | } | ||
101 | + | ||
102 | +struct ThreadPool { | ||
103 | + GThreadPool *t; | ||
104 | + size_t cur_work; | ||
105 | + QemuMutex cur_work_lock; | ||
106 | + QemuCond all_finished_cond; | ||
107 | +}; | ||
108 | + | ||
109 | +typedef struct { | ||
110 | + ThreadPoolFunc *func; | ||
111 | + void *opaque; | ||
112 | + GDestroyNotify opaque_destroy; | ||
113 | +} ThreadPoolElement; | ||
114 | + | ||
115 | +static void thread_pool_func(gpointer data, gpointer user_data) | ||
116 | +{ | ||
117 | + ThreadPool *pool = user_data; | ||
118 | + g_autofree ThreadPoolElement *el = data; | ||
119 | + | ||
120 | + el->func(el->opaque); | ||
121 | + | ||
122 | + if (el->opaque_destroy) { | ||
123 | + el->opaque_destroy(el->opaque); | ||
124 | + } | ||
125 | + | ||
126 | + QEMU_LOCK_GUARD(&pool->cur_work_lock); | ||
127 | + | ||
128 | + assert(pool->cur_work > 0); | ||
129 | + pool->cur_work--; | ||
130 | + | ||
131 | + if (pool->cur_work == 0) { | ||
132 | + qemu_cond_signal(&pool->all_finished_cond); | ||
133 | + } | ||
134 | +} | ||
135 | + | ||
136 | +ThreadPool *thread_pool_new(void) | ||
137 | +{ | ||
138 | + ThreadPool *pool = g_new(ThreadPool, 1); | ||
139 | + | ||
140 | + pool->cur_work = 0; | ||
141 | + qemu_mutex_init(&pool->cur_work_lock); | ||
142 | + qemu_cond_init(&pool->all_finished_cond); | ||
143 | + | ||
144 | + pool->t = g_thread_pool_new(thread_pool_func, pool, 0, TRUE, NULL); | ||
145 | + /* | ||
146 | + * g_thread_pool_new() can only return errors if initial thread(s) | ||
147 | + * creation fails but we ask for 0 initial threads above. | ||
148 | + */ | ||
149 | + assert(pool->t); | ||
150 | + | ||
151 | + return pool; | ||
152 | +} | ||
153 | + | ||
154 | +void thread_pool_free(ThreadPool *pool) | ||
155 | +{ | ||
156 | + /* | ||
157 | + * With _wait = TRUE this effectively waits for all | ||
158 | + * previously submitted work to complete first. | ||
159 | + */ | ||
160 | + g_thread_pool_free(pool->t, FALSE, TRUE); | ||
161 | + | ||
162 | + qemu_cond_destroy(&pool->all_finished_cond); | ||
163 | + qemu_mutex_destroy(&pool->cur_work_lock); | ||
164 | + | ||
165 | + g_free(pool); | ||
166 | +} | ||
167 | + | ||
168 | +void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, | ||
169 | + void *opaque, GDestroyNotify opaque_destroy) | ||
170 | +{ | ||
171 | + ThreadPoolElement *el = g_new(ThreadPoolElement, 1); | ||
172 | + | ||
173 | + el->func = func; | ||
174 | + el->opaque = opaque; | ||
175 | + el->opaque_destroy = opaque_destroy; | ||
176 | + | ||
177 | + WITH_QEMU_LOCK_GUARD(&pool->cur_work_lock) { | ||
178 | + pool->cur_work++; | ||
179 | + } | ||
180 | + | ||
181 | + /* | ||
182 | + * Ignore the return value since this function can only return errors | ||
183 | + * if creation of an additional thread fails but even in this case the | ||
184 | + * provided work is still getting queued (just for the existing threads). | ||
185 | + */ | ||
186 | + g_thread_pool_push(pool->t, el, NULL); | ||
187 | +} | ||
188 | + | ||
189 | +void thread_pool_submit_immediate(ThreadPool *pool, ThreadPoolFunc *func, | ||
190 | + void *opaque, GDestroyNotify opaque_destroy) | ||
191 | +{ | ||
192 | + thread_pool_submit(pool, func, opaque, opaque_destroy); | ||
193 | + thread_pool_adjust_max_threads_to_work(pool); | ||
194 | +} | ||
195 | + | ||
196 | +void thread_pool_wait(ThreadPool *pool) | ||
197 | +{ | ||
198 | + QEMU_LOCK_GUARD(&pool->cur_work_lock); | ||
199 | + | ||
200 | + while (pool->cur_work > 0) { | ||
201 | + qemu_cond_wait(&pool->all_finished_cond, | ||
202 | + &pool->cur_work_lock); | ||
203 | + } | ||
204 | +} | ||
205 | + | ||
206 | +bool thread_pool_set_max_threads(ThreadPool *pool, | ||
207 | + int max_threads) | ||
208 | +{ | ||
209 | + assert(max_threads > 0); | ||
210 | + | ||
211 | + return g_thread_pool_set_max_threads(pool->t, max_threads, NULL); | ||
212 | +} | ||
213 | + | ||
214 | +bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool) | ||
215 | +{ | ||
216 | + QEMU_LOCK_GUARD(&pool->cur_work_lock); | ||
217 | + | ||
218 | + return thread_pool_set_max_threads(pool, pool->cur_work); | ||
219 | +} | ||
220 | -- | ||
221 | 2.48.1 | ||
222 | |||
223 | diff view generated by jsdifflib |
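A minimal usage sketch of the generic pool API added above (demo_task() and the values[] array are made up for the example; the declarations come from "block/thread-pool.h"):

    static int demo_task(void *opaque)
    {
        int *value = opaque;

        *value *= 2;    /* stand-in for blocking or CPU-bound work */
        return 0;
    }

    static void demo_generic_pool(void)
    {
        ThreadPool *pool = thread_pool_new();
        int values[8];

        for (int i = 0; i < 8; i++) {
            values[i] = i;
            thread_pool_submit(pool, demo_task, &values[i], NULL);
        }

        thread_pool_wait(pool);   /* barrier: all eight tasks have finished */
        thread_pool_free(pool);   /* also waits before freeing the pool */
    }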
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | This QEMU_VM_COMMAND sub-command and its switchover_start SaveVMHandler are | ||
4 | used to mark the switchover point in the main migration stream. | ||
5 | |||
6 | It can be used to inform the destination that all pre-switchover main | ||
7 | migration stream data has been sent/received so it can start to process | ||
8 | post-switchover data that it might have received via other migration | ||
9 | channels like the multifd ones. | ||
10 | |||
11 | Also add the relevant MigrationState bit stream compatibility property and | ||
12 | its hw_compat entry. | ||
13 | |||
14 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
15 | Reviewed-by: Zhang Chen <zhangckid@gmail.com> # for the COLO part | ||
16 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
17 | Link: https://lore.kernel.org/qemu-devel/311be6da85fc7e49a7598684d80aa631778dcbce.1741124640.git.maciej.szmigiero@oracle.com | ||
18 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
19 | --- | ||
20 | include/migration/client-options.h | 4 +++ | ||
21 | include/migration/register.h | 12 +++++++++ | ||
22 | migration/migration.h | 2 ++ | ||
23 | migration/savevm.h | 1 + | ||
24 | hw/core/machine.c | 1 + | ||
25 | migration/colo.c | 3 +++ | ||
26 | migration/migration-hmp-cmds.c | 2 ++ | ||
27 | migration/migration.c | 2 ++ | ||
28 | migration/options.c | 9 +++++++ | ||
29 | migration/savevm.c | 39 ++++++++++++++++++++++++++++++ | ||
30 | migration/trace-events | 1 + | ||
31 | scripts/analyze-migration.py | 11 +++++++++ | ||
32 | 12 files changed, 87 insertions(+) | ||
33 | |||
34 | diff --git a/include/migration/client-options.h b/include/migration/client-options.h | ||
35 | index XXXXXXX..XXXXXXX 100644 | ||
36 | --- a/include/migration/client-options.h | ||
37 | +++ b/include/migration/client-options.h | ||
38 | @@ -XXX,XX +XXX,XX @@ | ||
39 | #ifndef QEMU_MIGRATION_CLIENT_OPTIONS_H | ||
40 | #define QEMU_MIGRATION_CLIENT_OPTIONS_H | ||
41 | |||
42 | + | ||
43 | +/* properties */ | ||
44 | +bool migrate_send_switchover_start(void); | ||
45 | + | ||
46 | /* capabilities */ | ||
47 | |||
48 | bool migrate_background_snapshot(void); | ||
49 | diff --git a/include/migration/register.h b/include/migration/register.h | ||
50 | index XXXXXXX..XXXXXXX 100644 | ||
51 | --- a/include/migration/register.h | ||
52 | +++ b/include/migration/register.h | ||
53 | @@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers { | ||
54 | * otherwise | ||
55 | */ | ||
56 | bool (*switchover_ack_needed)(void *opaque); | ||
57 | + | ||
58 | + /** | ||
59 | + * @switchover_start | ||
60 | + * | ||
61 | + * Notifies that the switchover has started. Called only on | ||
62 | + * the destination. | ||
63 | + * | ||
64 | + * @opaque: data pointer passed to register_savevm_live() | ||
65 | + * | ||
66 | + * Returns zero to indicate success and negative for error | ||
67 | + */ | ||
68 | + int (*switchover_start)(void *opaque); | ||
69 | } SaveVMHandlers; | ||
70 | |||
71 | /** | ||
72 | diff --git a/migration/migration.h b/migration/migration.h | ||
73 | index XXXXXXX..XXXXXXX 100644 | ||
74 | --- a/migration/migration.h | ||
75 | +++ b/migration/migration.h | ||
76 | @@ -XXX,XX +XXX,XX @@ struct MigrationState { | ||
77 | bool send_configuration; | ||
78 | /* Whether we send section footer during migration */ | ||
79 | bool send_section_footer; | ||
80 | + /* Whether we send switchover start notification during migration */ | ||
81 | + bool send_switchover_start; | ||
82 | |||
83 | /* Needed by postcopy-pause state */ | ||
84 | QemuSemaphore postcopy_pause_sem; | ||
85 | diff --git a/migration/savevm.h b/migration/savevm.h | ||
86 | index XXXXXXX..XXXXXXX 100644 | ||
87 | --- a/migration/savevm.h | ||
88 | +++ b/migration/savevm.h | ||
89 | @@ -XXX,XX +XXX,XX @@ void qemu_savevm_send_postcopy_listen(QEMUFile *f); | ||
90 | void qemu_savevm_send_postcopy_run(QEMUFile *f); | ||
91 | void qemu_savevm_send_postcopy_resume(QEMUFile *f); | ||
92 | void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name); | ||
93 | +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f); | ||
94 | |||
95 | void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, | ||
96 | uint16_t len, | ||
97 | diff --git a/hw/core/machine.c b/hw/core/machine.c | ||
98 | index XXXXXXX..XXXXXXX 100644 | ||
99 | --- a/hw/core/machine.c | ||
100 | +++ b/hw/core/machine.c | ||
101 | @@ -XXX,XX +XXX,XX @@ GlobalProperty hw_compat_9_2[] = { | ||
102 | { "virtio-balloon-pci-non-transitional", "vectors", "0" }, | ||
103 | { "virtio-mem-pci", "vectors", "0" }, | ||
104 | { "migration", "multifd-clean-tls-termination", "false" }, | ||
105 | + { "migration", "send-switchover-start", "off"}, | ||
106 | }; | ||
107 | const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2); | ||
108 | |||
109 | diff --git a/migration/colo.c b/migration/colo.c | ||
110 | index XXXXXXX..XXXXXXX 100644 | ||
111 | --- a/migration/colo.c | ||
112 | +++ b/migration/colo.c | ||
113 | @@ -XXX,XX +XXX,XX @@ static int colo_do_checkpoint_transaction(MigrationState *s, | ||
114 | bql_unlock(); | ||
115 | goto out; | ||
116 | } | ||
117 | + | ||
118 | + qemu_savevm_maybe_send_switchover_start(s->to_dst_file); | ||
119 | + | ||
120 | /* Note: device state is saved into buffer */ | ||
121 | ret = qemu_save_device_state(fb); | ||
122 | |||
123 | diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c | ||
124 | index XXXXXXX..XXXXXXX 100644 | ||
125 | --- a/migration/migration-hmp-cmds.c | ||
126 | +++ b/migration/migration-hmp-cmds.c | ||
127 | @@ -XXX,XX +XXX,XX @@ static void migration_global_dump(Monitor *mon) | ||
128 | ms->send_configuration ? "on" : "off"); | ||
129 | monitor_printf(mon, "send-section-footer: %s\n", | ||
130 | ms->send_section_footer ? "on" : "off"); | ||
131 | + monitor_printf(mon, "send-switchover-start: %s\n", | ||
132 | + ms->send_switchover_start ? "on" : "off"); | ||
133 | monitor_printf(mon, "clear-bitmap-shift: %u\n", | ||
134 | ms->clear_bitmap_shift); | ||
135 | } | ||
136 | diff --git a/migration/migration.c b/migration/migration.c | ||
137 | index XXXXXXX..XXXXXXX 100644 | ||
138 | --- a/migration/migration.c | ||
139 | +++ b/migration/migration.c | ||
140 | @@ -XXX,XX +XXX,XX @@ static bool migration_switchover_start(MigrationState *s, Error **errp) | ||
141 | |||
142 | precopy_notify_complete(); | ||
143 | |||
144 | + qemu_savevm_maybe_send_switchover_start(s->to_dst_file); | ||
145 | + | ||
146 | return true; | ||
147 | } | ||
148 | |||
149 | diff --git a/migration/options.c b/migration/options.c | ||
150 | index XXXXXXX..XXXXXXX 100644 | ||
151 | --- a/migration/options.c | ||
152 | +++ b/migration/options.c | ||
153 | @@ -XXX,XX +XXX,XX @@ const Property migration_properties[] = { | ||
154 | send_configuration, true), | ||
155 | DEFINE_PROP_BOOL("send-section-footer", MigrationState, | ||
156 | send_section_footer, true), | ||
157 | + DEFINE_PROP_BOOL("send-switchover-start", MigrationState, | ||
158 | + send_switchover_start, true), | ||
159 | DEFINE_PROP_BOOL("multifd-flush-after-each-section", MigrationState, | ||
160 | multifd_flush_after_each_section, false), | ||
161 | DEFINE_PROP_UINT8("x-clear-bitmap-shift", MigrationState, | ||
162 | @@ -XXX,XX +XXX,XX @@ bool migrate_auto_converge(void) | ||
163 | return s->capabilities[MIGRATION_CAPABILITY_AUTO_CONVERGE]; | ||
164 | } | ||
165 | |||
166 | +bool migrate_send_switchover_start(void) | ||
167 | +{ | ||
168 | + MigrationState *s = migrate_get_current(); | ||
169 | + | ||
170 | + return s->send_switchover_start; | ||
171 | +} | ||
172 | + | ||
173 | bool migrate_background_snapshot(void) | ||
174 | { | ||
175 | MigrationState *s = migrate_get_current(); | ||
176 | diff --git a/migration/savevm.c b/migration/savevm.c | ||
177 | index XXXXXXX..XXXXXXX 100644 | ||
178 | --- a/migration/savevm.c | ||
179 | +++ b/migration/savevm.c | ||
180 | @@ -XXX,XX +XXX,XX @@ enum qemu_vm_cmd { | ||
181 | MIG_CMD_ENABLE_COLO, /* Enable COLO */ | ||
182 | MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ | ||
183 | MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ | ||
184 | + MIG_CMD_SWITCHOVER_START, /* Switchover start notification */ | ||
185 | MIG_CMD_MAX | ||
186 | }; | ||
187 | |||
188 | @@ -XXX,XX +XXX,XX @@ static struct mig_cmd_args { | ||
189 | [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, | ||
190 | [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, | ||
191 | [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, | ||
192 | + [MIG_CMD_SWITCHOVER_START] = { .len = 0, .name = "SWITCHOVER_START" }, | ||
193 | [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, | ||
194 | }; | ||
195 | |||
196 | @@ -XXX,XX +XXX,XX @@ void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name) | ||
197 | qemu_savevm_command_send(f, MIG_CMD_RECV_BITMAP, len + 1, (uint8_t *)buf); | ||
198 | } | ||
199 | |||
200 | +static void qemu_savevm_send_switchover_start(QEMUFile *f) | ||
201 | +{ | ||
202 | + trace_savevm_send_switchover_start(); | ||
203 | + qemu_savevm_command_send(f, MIG_CMD_SWITCHOVER_START, 0, NULL); | ||
204 | +} | ||
205 | + | ||
206 | +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f) | ||
207 | +{ | ||
208 | + if (migrate_send_switchover_start()) { | ||
209 | + qemu_savevm_send_switchover_start(f); | ||
210 | + } | ||
211 | +} | ||
212 | + | ||
213 | bool qemu_savevm_state_blocked(Error **errp) | ||
214 | { | ||
215 | SaveStateEntry *se; | ||
216 | @@ -XXX,XX +XXX,XX @@ static int qemu_savevm_state(QEMUFile *f, Error **errp) | ||
217 | |||
218 | ret = qemu_file_get_error(f); | ||
219 | if (ret == 0) { | ||
220 | + qemu_savevm_maybe_send_switchover_start(f); | ||
221 | qemu_savevm_state_complete_precopy(f, false); | ||
222 | ret = qemu_file_get_error(f); | ||
223 | } | ||
224 | @@ -XXX,XX +XXX,XX @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) | ||
225 | return ret; | ||
226 | } | ||
227 | |||
228 | +static int loadvm_postcopy_handle_switchover_start(void) | ||
229 | +{ | ||
230 | + SaveStateEntry *se; | ||
231 | + | ||
232 | + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { | ||
233 | + int ret; | ||
234 | + | ||
235 | + if (!se->ops || !se->ops->switchover_start) { | ||
236 | + continue; | ||
237 | + } | ||
238 | + | ||
239 | + ret = se->ops->switchover_start(se->opaque); | ||
240 | + if (ret < 0) { | ||
241 | + return ret; | ||
242 | + } | ||
243 | + } | ||
244 | + | ||
245 | + return 0; | ||
246 | +} | ||
247 | + | ||
248 | /* | ||
249 | * Process an incoming 'QEMU_VM_COMMAND' | ||
250 | * 0 just a normal return | ||
251 | @@ -XXX,XX +XXX,XX @@ static int loadvm_process_command(QEMUFile *f) | ||
252 | |||
253 | case MIG_CMD_ENABLE_COLO: | ||
254 | return loadvm_process_enable_colo(mis); | ||
255 | + | ||
256 | + case MIG_CMD_SWITCHOVER_START: | ||
257 | + return loadvm_postcopy_handle_switchover_start(); | ||
258 | } | ||
259 | |||
260 | return 0; | ||
261 | diff --git a/migration/trace-events b/migration/trace-events | ||
262 | index XXXXXXX..XXXXXXX 100644 | ||
263 | --- a/migration/trace-events | ||
264 | +++ b/migration/trace-events | ||
265 | @@ -XXX,XX +XXX,XX @@ savevm_send_postcopy_run(void) "" | ||
266 | savevm_send_postcopy_resume(void) "" | ||
267 | savevm_send_colo_enable(void) "" | ||
268 | savevm_send_recv_bitmap(char *name) "%s" | ||
269 | +savevm_send_switchover_start(void) "" | ||
270 | savevm_state_setup(void) "" | ||
271 | savevm_state_resume_prepare(void) "" | ||
272 | savevm_state_header(void) "" | ||
273 | diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py | ||
274 | index XXXXXXX..XXXXXXX 100755 | ||
275 | --- a/scripts/analyze-migration.py | ||
276 | +++ b/scripts/analyze-migration.py | ||
277 | @@ -XXX,XX +XXX,XX @@ class MigrationDump(object): | ||
278 | QEMU_VM_SUBSECTION = 0x05 | ||
279 | QEMU_VM_VMDESCRIPTION = 0x06 | ||
280 | QEMU_VM_CONFIGURATION = 0x07 | ||
281 | + QEMU_VM_COMMAND = 0x08 | ||
282 | QEMU_VM_SECTION_FOOTER= 0x7e | ||
283 | + QEMU_MIG_CMD_SWITCHOVER_START = 0x0b | ||
284 | |||
285 | def __init__(self, filename): | ||
286 | self.section_classes = { | ||
287 | @@ -XXX,XX +XXX,XX @@ def read(self, desc_only = False, dump_memory = False, | ||
288 | elif section_type == self.QEMU_VM_SECTION_PART or section_type == self.QEMU_VM_SECTION_END: | ||
289 | section_id = file.read32() | ||
290 | self.sections[section_id].read() | ||
291 | + elif section_type == self.QEMU_VM_COMMAND: | ||
292 | + command_type = file.read16() | ||
293 | + command_data_len = file.read16() | ||
294 | + if command_type != self.QEMU_MIG_CMD_SWITCHOVER_START: | ||
295 | + raise Exception("Unknown QEMU_VM_COMMAND: %x" % | ||
296 | + (command_type)) | ||
297 | + if command_data_len != 0: | ||
298 | + raise Exception("Invalid SWITCHOVER_START length: %x" % | ||
299 | + (command_data_len)) | ||
300 | elif section_type == self.QEMU_VM_SECTION_FOOTER: | ||
301 | read_section_id = file.read32() | ||
302 | if read_section_id != section_id: | ||
303 | -- | ||
304 | 2.48.1 | ||
305 | |||
306 | diff view generated by jsdifflib |
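A hedged sketch of how a device might implement the new handler (DemoDevState and its field are hypothetical, not from this series; only the switchover_start member comes from the SaveVMHandlers extension above):

    typedef struct DemoDevState {
        bool switchover_started;
    } DemoDevState;

    /* Called on the destination when MIG_CMD_SWITCHOVER_START is received */
    static int demo_switchover_start(void *opaque)
    {
        DemoDevState *s = opaque;

        /*
         * All pre-switchover data from the main migration stream has been
         * processed; it is now safe to consume state received via other
         * channels (e.g. multifd).
         */
        s->switchover_started = true;
        return 0;   /* a negative value would fail the migration */
    }

    static const SaveVMHandlers demo_savevm_handlers = {
        .switchover_start = demo_switchover_start,
        /* ... other handlers ... */
    };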
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | qemu_loadvm_load_state_buffer() and its load_state_buffer | ||
4 | SaveVMHandler allow providing a device state buffer to an explicitly | ||
5 | specified device via its idstr and instance id. | ||
6 | |||
7 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
8 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
9 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
10 | Link: https://lore.kernel.org/qemu-devel/71ca753286b87831ced4afd422e2e2bed071af25.1741124640.git.maciej.szmigiero@oracle.com | ||
11 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
12 | --- | ||
13 | include/migration/register.h | 15 +++++++++++++++ | ||
14 | migration/savevm.h | 3 +++ | ||
15 | migration/savevm.c | 23 +++++++++++++++++++++++ | ||
16 | 3 files changed, 41 insertions(+) | ||
17 | |||
18 | diff --git a/include/migration/register.h b/include/migration/register.h | ||
19 | index XXXXXXX..XXXXXXX 100644 | ||
20 | --- a/include/migration/register.h | ||
21 | +++ b/include/migration/register.h | ||
22 | @@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers { | ||
23 | */ | ||
24 | int (*load_state)(QEMUFile *f, void *opaque, int version_id); | ||
25 | |||
26 | + /** | ||
27 | + * @load_state_buffer (invoked outside the BQL) | ||
28 | + * | ||
29 | + * Load device state buffer provided to qemu_loadvm_load_state_buffer(). | ||
30 | + * | ||
31 | + * @opaque: data pointer passed to register_savevm_live() | ||
32 | + * @buf: the data buffer to load | ||
33 | + * @len: the data length in buffer | ||
34 | + * @errp: pointer to Error*, to store an error if it happens. | ||
35 | + * | ||
36 | + * Returns true to indicate success and false for errors. | ||
37 | + */ | ||
38 | + bool (*load_state_buffer)(void *opaque, char *buf, size_t len, | ||
39 | + Error **errp); | ||
40 | + | ||
41 | /** | ||
42 | * @load_setup | ||
43 | * | ||
44 | diff --git a/migration/savevm.h b/migration/savevm.h | ||
45 | index XXXXXXX..XXXXXXX 100644 | ||
46 | --- a/migration/savevm.h | ||
47 | +++ b/migration/savevm.h | ||
48 | @@ -XXX,XX +XXX,XX @@ int qemu_loadvm_approve_switchover(void); | ||
49 | int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, | ||
50 | bool in_postcopy); | ||
51 | |||
52 | +bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id, | ||
53 | + char *buf, size_t len, Error **errp); | ||
54 | + | ||
55 | #endif | ||
56 | diff --git a/migration/savevm.c b/migration/savevm.c | ||
57 | index XXXXXXX..XXXXXXX 100644 | ||
58 | --- a/migration/savevm.c | ||
59 | +++ b/migration/savevm.c | ||
60 | @@ -XXX,XX +XXX,XX @@ int qemu_loadvm_approve_switchover(void) | ||
61 | return migrate_send_rp_switchover_ack(mis); | ||
62 | } | ||
63 | |||
64 | +bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id, | ||
65 | + char *buf, size_t len, Error **errp) | ||
66 | +{ | ||
67 | + SaveStateEntry *se; | ||
68 | + | ||
69 | + se = find_se(idstr, instance_id); | ||
70 | + if (!se) { | ||
71 | + error_setg(errp, | ||
72 | + "Unknown idstr %s or instance id %u for load state buffer", | ||
73 | + idstr, instance_id); | ||
74 | + return false; | ||
75 | + } | ||
76 | + | ||
77 | + if (!se->ops || !se->ops->load_state_buffer) { | ||
78 | + error_setg(errp, | ||
79 | + "idstr %s / instance %u has no load state buffer operation", | ||
80 | + idstr, instance_id); | ||
81 | + return false; | ||
82 | + } | ||
83 | + | ||
84 | + return se->ops->load_state_buffer(se->opaque, buf, len, errp); | ||
85 | +} | ||
86 | + | ||
87 | bool save_snapshot(const char *name, bool overwrite, const char *vmstate, | ||
88 | bool has_devices, strList *devices, Error **errp) | ||
89 | { | ||
90 | -- | ||
91 | 2.48.1 | ||
92 | |||
93 | diff view generated by jsdifflib |
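A hedged sketch of a caller on the receive side (DemoPacket and its fields are invented for illustration; only qemu_loadvm_load_state_buffer() comes from the patch above):

    typedef struct DemoPacket {
        char idstr[256];
        uint32_t instance_id;
        char *data;
        size_t data_len;
    } DemoPacket;

    static bool demo_deliver_device_state(DemoPacket *packet, Error **errp)
    {
        /* Routes the buffer to the device's load_state_buffer handler */
        return qemu_loadvm_load_state_buffer(packet->idstr,
                                             packet->instance_id,
                                             packet->data, packet->data_len,
                                             errp);
    }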
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | All callers to migration_incoming_state_destroy() other than | ||
4 | postcopy_ram_listen_thread() do this call with BQL held. | ||
5 | |||
6 | Since migration_incoming_state_destroy() ultimately calls "load_cleanup" | ||
7 | SaveVMHandlers, and it will soon call BQL-sensitive code, it makes sense | ||
8 | to always call that function under the BQL rather than have it deal with | ||
9 | both cases (with and without the BQL). | ||
10 | Add the necessary bql_lock() and bql_unlock() to | ||
11 | postcopy_ram_listen_thread(). | ||
12 | |||
13 | qemu_loadvm_state_main() in postcopy_ram_listen_thread() could call | ||
14 | "load_state" SaveVMHandlers that are expecting BQL to be held. | ||
15 | |||
16 | In principle, the only devices that should be arriving on the migration | ||
17 | channel serviced by postcopy_ram_listen_thread() are those that are | ||
18 | postcopiable and whose load handlers are safe to call without the BQL | ||
19 | being held. | ||
20 | |||
21 | But nothing currently prevents the source from sending data for "unsafe" | ||
22 | devices, which would cause trouble there. | ||
23 | Add a TODO comment there so it's clear that it would be good to improve | ||
24 | handling of such an (erroneous) case in the future. | ||
25 | |||
26 | Acked-by: Peter Xu <peterx@redhat.com> | ||
27 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
28 | Link: https://lore.kernel.org/qemu-devel/21bb5ca337b1d5a802e697f553f37faf296b5ff4.1741193259.git.maciej.szmigiero@oracle.com | ||
29 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
30 | --- | ||
31 | migration/migration.c | 13 +++++++++++++ | ||
32 | migration/savevm.c | 4 ++++ | ||
33 | 2 files changed, 17 insertions(+) | ||
34 | |||
35 | diff --git a/migration/migration.c b/migration/migration.c | ||
36 | index XXXXXXX..XXXXXXX 100644 | ||
37 | --- a/migration/migration.c | ||
38 | +++ b/migration/migration.c | ||
39 | @@ -XXX,XX +XXX,XX @@ void migration_incoming_state_destroy(void) | ||
40 | struct MigrationIncomingState *mis = migration_incoming_get_current(); | ||
41 | |||
42 | multifd_recv_cleanup(); | ||
43 | + | ||
44 | /* | ||
45 | * RAM state cleanup needs to happen after multifd cleanup, because | ||
46 | * multifd threads can use some of its states (receivedmap). | ||
47 | + * The VFIO load_cleanup() implementation is BQL-sensitive. It requires | ||
48 | + * BQL must NOT be taken when recycling load threads, so that it won't | ||
49 | + * block the load threads from making progress on address space | ||
50 | + * modification operations. | ||
51 | + * | ||
52 | + * To make it work, we could try to not take BQL for all load_cleanup(), | ||
53 | + * or conditionally unlock BQL only if bql_locked() in VFIO. | ||
54 | + * | ||
55 | + * Since most existing call sites take BQL for load_cleanup(), make | ||
56 | + * it simple by taking BQL always as the rule, so that VFIO can unlock | ||
57 | + * BQL and retake unconditionally. | ||
58 | */ | ||
59 | + assert(bql_locked()); | ||
60 | qemu_loadvm_state_cleanup(); | ||
61 | |||
62 | if (mis->to_src_file) { | ||
63 | diff --git a/migration/savevm.c b/migration/savevm.c | ||
64 | index XXXXXXX..XXXXXXX 100644 | ||
65 | --- a/migration/savevm.c | ||
66 | +++ b/migration/savevm.c | ||
67 | @@ -XXX,XX +XXX,XX @@ static void *postcopy_ram_listen_thread(void *opaque) | ||
68 | * in qemu_file, and thus we must be blocking now. | ||
69 | */ | ||
70 | qemu_file_set_blocking(f, true); | ||
71 | + | ||
72 | + /* TODO: sanity check that only postcopiable data will be loaded here */ | ||
73 | load_res = qemu_loadvm_state_main(f, mis); | ||
74 | |||
75 | /* | ||
76 | @@ -XXX,XX +XXX,XX @@ static void *postcopy_ram_listen_thread(void *opaque) | ||
77 | * (If something broke then qemu will have to exit anyway since it's | ||
78 | * got a bad migration state). | ||
79 | */ | ||
80 | + bql_lock(); | ||
81 | migration_incoming_state_destroy(); | ||
82 | + bql_unlock(); | ||
83 | |||
84 | rcu_unregister_thread(); | ||
85 | mis->have_listen_thread = false; | ||
86 | -- | ||
87 | 2.48.1 | ||
88 | |||
89 | diff view generated by jsdifflib |
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Automatic memory management helps avoid memory safety issues. | ||
4 | |||
5 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
7 | Link: https://lore.kernel.org/qemu-devel/a5843c5fa64d7e5239a4316092ec0ef0d10c2320.1741124640.git.maciej.szmigiero@oracle.com | ||
8 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
9 | --- | ||
10 | include/qapi/error.h | 2 ++ | ||
11 | 1 file changed, 2 insertions(+) | ||
12 | |||
13 | diff --git a/include/qapi/error.h b/include/qapi/error.h | ||
14 | index XXXXXXX..XXXXXXX 100644 | ||
15 | --- a/include/qapi/error.h | ||
16 | +++ b/include/qapi/error.h | ||
17 | @@ -XXX,XX +XXX,XX @@ Error *error_copy(const Error *err); | ||
18 | */ | ||
19 | void error_free(Error *err); | ||
20 | |||
21 | +G_DEFINE_AUTOPTR_CLEANUP_FUNC(Error, error_free) | ||
22 | + | ||
23 | /* | ||
24 | * Convenience function to assert that *@errp is set, then silently free it. | ||
25 | */ | ||
26 | -- | ||
27 | 2.48.1 | ||
28 | |||
29 | diff view generated by jsdifflib |
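A minimal usage sketch of the new cleanup attribute (demo_try_fast_path() and demo_slow_path() are hypothetical helpers, not from this series):

    static bool demo_probe(void)
    {
        g_autoptr(Error) local_err = NULL;

        if (!demo_try_fast_path(&local_err)) {
            /*
             * The error is deliberately ignored here; g_autoptr() frees
             * local_err automatically when it goes out of scope, so no
             * explicit error_free() call is needed.
             */
            return demo_slow_path();
        }
        return true;
    }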
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Some drivers might want to make use of auxiliary helper threads during VM | ||
4 | state loading, for example to make sure that their blocking (sync) I/O | ||
5 | operations don't block the rest of the migration process. | ||
6 | |||
7 | Add a migration core managed thread pool to facilitate this use case. | ||
8 | |||
9 | The migration core will wait for these threads to finish before | ||
11 | (re)starting the VM at the destination. | ||
11 | |||
12 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
13 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
14 | Link: https://lore.kernel.org/qemu-devel/b09fd70369b6159c75847e69f235cb908b02570c.1741124640.git.maciej.szmigiero@oracle.com | ||
15 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
16 | --- | ||
17 | include/migration/misc.h | 3 ++ | ||
18 | include/qemu/typedefs.h | 2 + | ||
19 | migration/migration.h | 5 +++ | ||
20 | migration/savevm.h | 2 +- | ||
21 | migration/migration.c | 2 +- | ||
22 | migration/savevm.c | 95 +++++++++++++++++++++++++++++++++++++++- | ||
23 | 6 files changed, 105 insertions(+), 4 deletions(-) | ||
24 | |||
25 | diff --git a/include/migration/misc.h b/include/migration/misc.h | ||
26 | index XXXXXXX..XXXXXXX 100644 | ||
27 | --- a/include/migration/misc.h | ||
28 | +++ b/include/migration/misc.h | ||
29 | @@ -XXX,XX +XXX,XX @@ bool migrate_ram_is_ignored(RAMBlock *block); | ||
30 | /* migration/block.c */ | ||
31 | |||
32 | AnnounceParameters *migrate_announce_params(void); | ||
33 | + | ||
34 | /* migration/savevm.c */ | ||
35 | |||
36 | void dump_vmstate_json_to_file(FILE *out_fp); | ||
37 | +void qemu_loadvm_start_load_thread(MigrationLoadThread function, | ||
38 | + void *opaque); | ||
39 | |||
40 | /* migration/migration.c */ | ||
41 | void migration_object_init(void); | ||
42 | diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h | ||
43 | index XXXXXXX..XXXXXXX 100644 | ||
44 | --- a/include/qemu/typedefs.h | ||
45 | +++ b/include/qemu/typedefs.h | ||
46 | @@ -XXX,XX +XXX,XX @@ typedef struct IRQState *qemu_irq; | ||
47 | * Function types | ||
48 | */ | ||
49 | typedef void (*qemu_irq_handler)(void *opaque, int n, int level); | ||
50 | +typedef bool (*MigrationLoadThread)(void *opaque, bool *should_quit, | ||
51 | + Error **errp); | ||
52 | |||
53 | #endif /* QEMU_TYPEDEFS_H */ | ||
54 | diff --git a/migration/migration.h b/migration/migration.h | ||
55 | index XXXXXXX..XXXXXXX 100644 | ||
56 | --- a/migration/migration.h | ||
57 | +++ b/migration/migration.h | ||
58 | @@ -XXX,XX +XXX,XX @@ | ||
59 | #define MIGRATION_THREAD_DST_PREEMPT "mig/dst/preempt" | ||
60 | |||
61 | struct PostcopyBlocktimeContext; | ||
62 | +typedef struct ThreadPool ThreadPool; | ||
63 | |||
64 | #define MIGRATION_RESUME_ACK_VALUE (1) | ||
65 | |||
66 | @@ -XXX,XX +XXX,XX @@ struct MigrationIncomingState { | ||
67 | Coroutine *colo_incoming_co; | ||
68 | QemuSemaphore colo_incoming_sem; | ||
69 | |||
70 | + /* Optional load threads pool and its thread exit request flag */ | ||
71 | + ThreadPool *load_threads; | ||
72 | + bool load_threads_abort; | ||
73 | + | ||
74 | /* | ||
75 | * PostcopyBlocktimeContext to keep information for postcopy | ||
76 | * live migration, to calculate vCPU block time | ||
77 | diff --git a/migration/savevm.h b/migration/savevm.h | ||
78 | index XXXXXXX..XXXXXXX 100644 | ||
79 | --- a/migration/savevm.h | ||
80 | +++ b/migration/savevm.h | ||
81 | @@ -XXX,XX +XXX,XX @@ void qemu_savevm_live_state(QEMUFile *f); | ||
82 | int qemu_save_device_state(QEMUFile *f); | ||
83 | |||
84 | int qemu_loadvm_state(QEMUFile *f); | ||
85 | -void qemu_loadvm_state_cleanup(void); | ||
86 | +void qemu_loadvm_state_cleanup(MigrationIncomingState *mis); | ||
87 | int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis); | ||
88 | int qemu_load_device_state(QEMUFile *f); | ||
89 | int qemu_loadvm_approve_switchover(void); | ||
90 | diff --git a/migration/migration.c b/migration/migration.c | ||
91 | index XXXXXXX..XXXXXXX 100644 | ||
92 | --- a/migration/migration.c | ||
93 | +++ b/migration/migration.c | ||
94 | @@ -XXX,XX +XXX,XX @@ void migration_incoming_state_destroy(void) | ||
95 | * BQL and retake unconditionally. | ||
96 | */ | ||
97 | assert(bql_locked()); | ||
98 | - qemu_loadvm_state_cleanup(); | ||
99 | + qemu_loadvm_state_cleanup(mis); | ||
100 | |||
101 | if (mis->to_src_file) { | ||
102 | /* Tell source that we are done */ | ||
103 | diff --git a/migration/savevm.c b/migration/savevm.c | ||
104 | index XXXXXXX..XXXXXXX 100644 | ||
105 | --- a/migration/savevm.c | ||
106 | +++ b/migration/savevm.c | ||
107 | @@ -XXX,XX +XXX,XX @@ | ||
108 | #include "qemu/job.h" | ||
109 | #include "qemu/main-loop.h" | ||
110 | #include "block/snapshot.h" | ||
111 | +#include "block/thread-pool.h" | ||
112 | #include "qemu/cutils.h" | ||
113 | #include "io/channel-buffer.h" | ||
114 | #include "io/channel-file.h" | ||
115 | @@ -XXX,XX +XXX,XX @@ static struct mig_cmd_args { | ||
116 | * generic extendable format with an exception for two old entities. | ||
117 | */ | ||
118 | |||
119 | +/***********************************************************/ | ||
120 | +/* Optional load threads pool support */ | ||
121 | + | ||
122 | +static void qemu_loadvm_thread_pool_create(MigrationIncomingState *mis) | ||
123 | +{ | ||
124 | + assert(!mis->load_threads); | ||
125 | + mis->load_threads = thread_pool_new(); | ||
126 | + mis->load_threads_abort = false; | ||
127 | +} | ||
128 | + | ||
129 | +static void qemu_loadvm_thread_pool_destroy(MigrationIncomingState *mis) | ||
130 | +{ | ||
131 | + qatomic_set(&mis->load_threads_abort, true); | ||
132 | + | ||
133 | + bql_unlock(); /* Load threads might be waiting for BQL */ | ||
134 | + g_clear_pointer(&mis->load_threads, thread_pool_free); | ||
135 | + bql_lock(); | ||
136 | +} | ||
137 | + | ||
138 | +static bool qemu_loadvm_thread_pool_wait(MigrationState *s, | ||
139 | + MigrationIncomingState *mis) | ||
140 | +{ | ||
141 | + bql_unlock(); /* Let load threads do work requiring BQL */ | ||
142 | + thread_pool_wait(mis->load_threads); | ||
143 | + bql_lock(); | ||
144 | + | ||
145 | + return !migrate_has_error(s); | ||
146 | +} | ||
147 | + | ||
148 | /***********************************************************/ | ||
149 | /* savevm/loadvm support */ | ||
150 | |||
151 | @@ -XXX,XX +XXX,XX @@ static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp) | ||
152 | return 0; | ||
153 | } | ||
154 | |||
155 | -void qemu_loadvm_state_cleanup(void) | ||
156 | +struct LoadThreadData { | ||
157 | + MigrationLoadThread function; | ||
158 | + void *opaque; | ||
159 | +}; | ||
160 | + | ||
161 | +static int qemu_loadvm_load_thread(void *thread_opaque) | ||
162 | +{ | ||
163 | + struct LoadThreadData *data = thread_opaque; | ||
164 | + MigrationIncomingState *mis = migration_incoming_get_current(); | ||
165 | + g_autoptr(Error) local_err = NULL; | ||
166 | + | ||
167 | + if (!data->function(data->opaque, &mis->load_threads_abort, &local_err)) { | ||
168 | + MigrationState *s = migrate_get_current(); | ||
169 | + | ||
170 | + /* | ||
171 | + * Can't set load_threads_abort here since processing of main migration | ||
172 | + * channel data could still be happening, resulting in launching of new | ||
173 | + * load threads. | ||
174 | + */ | ||
175 | + | ||
176 | + assert(local_err); | ||
177 | + | ||
178 | + /* | ||
179 | + * In case multiple load threads fail, which thread's error we | ||
180 | + * end up reporting is purely arbitrary. | ||
181 | + */ | ||
182 | + migrate_set_error(s, local_err); | ||
183 | + } | ||
184 | + | ||
185 | + return 0; | ||
186 | +} | ||
187 | + | ||
188 | +void qemu_loadvm_start_load_thread(MigrationLoadThread function, | ||
189 | + void *opaque) | ||
190 | +{ | ||
191 | + MigrationIncomingState *mis = migration_incoming_get_current(); | ||
192 | + struct LoadThreadData *data; | ||
193 | + | ||
194 | + /* We only set it from this thread so it's okay to read it directly */ | ||
195 | + assert(!mis->load_threads_abort); | ||
196 | + | ||
197 | + data = g_new(struct LoadThreadData, 1); | ||
198 | + data->function = function; | ||
199 | + data->opaque = opaque; | ||
200 | + | ||
201 | + thread_pool_submit_immediate(mis->load_threads, qemu_loadvm_load_thread, | ||
202 | + data, g_free); | ||
203 | +} | ||
204 | + | ||
205 | +void qemu_loadvm_state_cleanup(MigrationIncomingState *mis) | ||
206 | { | ||
207 | SaveStateEntry *se; | ||
208 | |||
209 | trace_loadvm_state_cleanup(); | ||
210 | + | ||
211 | QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { | ||
212 | if (se->ops && se->ops->load_cleanup) { | ||
213 | se->ops->load_cleanup(se->opaque); | ||
214 | } | ||
215 | } | ||
216 | + | ||
217 | + qemu_loadvm_thread_pool_destroy(mis); | ||
218 | } | ||
219 | |||
220 | /* Return true if we should continue the migration, or false. */ | ||
221 | @@ -XXX,XX +XXX,XX @@ out: | ||
222 | |||
223 | int qemu_loadvm_state(QEMUFile *f) | ||
224 | { | ||
225 | + MigrationState *s = migrate_get_current(); | ||
226 | MigrationIncomingState *mis = migration_incoming_get_current(); | ||
227 | Error *local_err = NULL; | ||
228 | int ret; | ||
229 | @@ -XXX,XX +XXX,XX @@ int qemu_loadvm_state(QEMUFile *f) | ||
230 | return -EINVAL; | ||
231 | } | ||
232 | |||
233 | + qemu_loadvm_thread_pool_create(mis); | ||
234 | + | ||
235 | ret = qemu_loadvm_state_header(f); | ||
236 | if (ret) { | ||
237 | return ret; | ||
238 | @@ -XXX,XX +XXX,XX @@ int qemu_loadvm_state(QEMUFile *f) | ||
239 | |||
240 | /* When reaching here, it must be precopy */ | ||
241 | if (ret == 0) { | ||
242 | - if (migrate_has_error(migrate_get_current())) { | ||
243 | + if (migrate_has_error(migrate_get_current()) || | ||
244 | + !qemu_loadvm_thread_pool_wait(s, mis)) { | ||
245 | ret = -EINVAL; | ||
246 | } else { | ||
247 | ret = qemu_file_get_error(f); | ||
248 | } | ||
249 | } | ||
250 | + /* | ||
251 | + * Set this flag unconditionally so we'll catch further attempts to | ||
252 | + * start additional threads via an appropriate assert() | ||
253 | + */ | ||
254 | + qatomic_set(&mis->load_threads_abort, true); | ||
255 | |||
256 | /* | ||
257 | * Try to read in the VMDESC section as well, so that dumping tools that | ||
258 | -- | ||
259 | 2.48.1 | ||
260 | |||
261 | diff view generated by jsdifflib |
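
For readers new to this API, here is a minimal, hedged sketch of how a consumer could drive the optional load-thread pool added above. Everything named my_* is hypothetical, the header providing the qemu_loadvm_start_load_thread() declaration is assumed, and only the function and callback signatures visible in the diff are taken as given.

    #include "qemu/osdep.h"          /* pulls in qatomic_read() */
    #include "qapi/error.h"
    /* header providing qemu_loadvm_start_load_thread() is assumed */

    typedef struct MyDeviceState MyDeviceState;                 /* hypothetical */
    static bool my_consume_one_buffer(MyDeviceState *d, Error **errp); /* hypothetical */
    static bool my_all_buffers_consumed(MyDeviceState *d);            /* hypothetical */

    /* MigrationLoadThread callback: runs in a pool thread until done or aborted. */
    static bool my_load_worker(void *opaque, bool *should_quit, Error **errp)
    {
        MyDeviceState *dev = opaque;

        while (!qatomic_read(should_quit)) {
            /* Consume one queued device-state buffer, if any. */
            if (!my_consume_one_buffer(dev, errp)) {
                return false;   /* error is recorded via migrate_set_error() */
            }
            if (my_all_buffers_consumed(dev)) {
                return true;
            }
        }
        return true;    /* asked to quit; not an error by itself */
    }

    static void my_begin_incoming_load(MyDeviceState *dev)
    {
        /* The pool joins (or aborts) the thread in qemu_loadvm_state_cleanup(). */
        qemu_loadvm_start_load_thread(my_load_worker, dev);
    }
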
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Read packet header first so in the future we will be able to | ||
4 | differentiate between a RAM multifd packet and a device state multifd | ||
5 | packet. | ||
6 | |||
7 | Since these two are of different size we can't read the packet body until | ||
8 | we know which packet type it is. | ||
9 | |||
10 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
11 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
12 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
13 | Link: https://lore.kernel.org/qemu-devel/832ad055fe447561ac1ad565d61658660cb3f63f.1741124640.git.maciej.szmigiero@oracle.com | ||
14 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
15 | --- | ||
16 | migration/multifd.h | 5 +++++ | ||
17 | migration/multifd.c | 55 ++++++++++++++++++++++++++++++++++++--------- | ||
18 | 2 files changed, 49 insertions(+), 11 deletions(-) | ||
19 | |||
20 | diff --git a/migration/multifd.h b/migration/multifd.h | ||
21 | index XXXXXXX..XXXXXXX 100644 | ||
22 | --- a/migration/multifd.h | ||
23 | +++ b/migration/multifd.h | ||
24 | @@ -XXX,XX +XXX,XX @@ typedef struct { | ||
25 | uint32_t magic; | ||
26 | uint32_t version; | ||
27 | uint32_t flags; | ||
28 | +} __attribute__((packed)) MultiFDPacketHdr_t; | ||
29 | + | ||
30 | +typedef struct { | ||
31 | + MultiFDPacketHdr_t hdr; | ||
32 | + | ||
33 | /* maximum number of allocated pages */ | ||
34 | uint32_t pages_alloc; | ||
35 | /* non zero pages */ | ||
36 | diff --git a/migration/multifd.c b/migration/multifd.c | ||
37 | index XXXXXXX..XXXXXXX 100644 | ||
38 | --- a/migration/multifd.c | ||
39 | +++ b/migration/multifd.c | ||
40 | @@ -XXX,XX +XXX,XX @@ void multifd_send_fill_packet(MultiFDSendParams *p) | ||
41 | |||
42 | memset(packet, 0, p->packet_len); | ||
43 | |||
44 | - packet->magic = cpu_to_be32(MULTIFD_MAGIC); | ||
45 | - packet->version = cpu_to_be32(MULTIFD_VERSION); | ||
46 | + packet->hdr.magic = cpu_to_be32(MULTIFD_MAGIC); | ||
47 | + packet->hdr.version = cpu_to_be32(MULTIFD_VERSION); | ||
48 | |||
49 | - packet->flags = cpu_to_be32(p->flags); | ||
50 | + packet->hdr.flags = cpu_to_be32(p->flags); | ||
51 | packet->next_packet_size = cpu_to_be32(p->next_packet_size); | ||
52 | |||
53 | packet_num = qatomic_fetch_inc(&multifd_send_state->packet_num); | ||
54 | @@ -XXX,XX +XXX,XX @@ void multifd_send_fill_packet(MultiFDSendParams *p) | ||
55 | p->flags, p->next_packet_size); | ||
56 | } | ||
57 | |||
58 | -static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) | ||
59 | +static int multifd_recv_unfill_packet_header(MultiFDRecvParams *p, | ||
60 | + const MultiFDPacketHdr_t *hdr, | ||
61 | + Error **errp) | ||
62 | { | ||
63 | - const MultiFDPacket_t *packet = p->packet; | ||
64 | - uint32_t magic = be32_to_cpu(packet->magic); | ||
65 | - uint32_t version = be32_to_cpu(packet->version); | ||
66 | - int ret = 0; | ||
67 | + uint32_t magic = be32_to_cpu(hdr->magic); | ||
68 | + uint32_t version = be32_to_cpu(hdr->version); | ||
69 | |||
70 | if (magic != MULTIFD_MAGIC) { | ||
71 | error_setg(errp, "multifd: received packet magic %x, expected %x", | ||
72 | @@ -XXX,XX +XXX,XX @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) | ||
73 | return -1; | ||
74 | } | ||
75 | |||
76 | - p->flags = be32_to_cpu(packet->flags); | ||
77 | + p->flags = be32_to_cpu(hdr->flags); | ||
78 | + | ||
79 | + return 0; | ||
80 | +} | ||
81 | + | ||
82 | +static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) | ||
83 | +{ | ||
84 | + const MultiFDPacket_t *packet = p->packet; | ||
85 | + int ret = 0; | ||
86 | + | ||
87 | p->next_packet_size = be32_to_cpu(packet->next_packet_size); | ||
88 | p->packet_num = be64_to_cpu(packet->packet_num); | ||
89 | p->packets_recved++; | ||
90 | @@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque) | ||
91 | } | ||
92 | |||
93 | while (true) { | ||
94 | + MultiFDPacketHdr_t hdr; | ||
95 | uint32_t flags = 0; | ||
96 | bool has_data = false; | ||
97 | + uint8_t *pkt_buf; | ||
98 | + size_t pkt_len; | ||
99 | + | ||
100 | p->normal_num = 0; | ||
101 | |||
102 | if (use_packets) { | ||
103 | struct iovec iov = { | ||
104 | - .iov_base = (void *)p->packet, | ||
105 | - .iov_len = p->packet_len | ||
106 | + .iov_base = (void *)&hdr, | ||
107 | + .iov_len = sizeof(hdr) | ||
108 | }; | ||
109 | |||
110 | if (multifd_recv_should_exit()) { | ||
111 | @@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque) | ||
112 | break; | ||
113 | } | ||
114 | |||
115 | + ret = multifd_recv_unfill_packet_header(p, &hdr, &local_err); | ||
116 | + if (ret) { | ||
117 | + break; | ||
118 | + } | ||
119 | + | ||
120 | + pkt_buf = (uint8_t *)p->packet + sizeof(hdr); | ||
121 | + pkt_len = p->packet_len - sizeof(hdr); | ||
122 | + | ||
123 | + ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len, | ||
124 | + &local_err); | ||
125 | + if (!ret) { | ||
126 | + /* EOF */ | ||
127 | + error_setg(&local_err, "multifd: unexpected EOF after packet header"); | ||
128 | + break; | ||
129 | + } | ||
130 | + | ||
131 | + if (ret == -1) { | ||
132 | + break; | ||
133 | + } | ||
134 | + | ||
135 | qemu_mutex_lock(&p->mutex); | ||
136 | ret = multifd_recv_unfill_packet(p, &local_err); | ||
137 | if (ret) { | ||
138 | -- | ||
139 | 2.48.1 | ||
140 | |||
141 | diff view generated by jsdifflib |
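
As a standalone, hedged sketch of the framing idea introduced above: a fixed-size header is read first, and only then is the size of the rest of the packet known. Nothing beyond the struct layout visible in the patch is assumed; the flag bit, the second packet type, and the fixed body sizes are placeholders (real multifd uses a dynamic RAM packet length and converts fields from big-endian), and this is not QEMU code.

    #include <stdint.h>
    #include <stddef.h>

    /* Common fixed-size header, mirroring MultiFDPacketHdr_t in the patch. */
    typedef struct {
        uint32_t magic;
        uint32_t version;
        uint32_t flags;
    } __attribute__((packed)) PacketHdr;

    /* Two hypothetical packet types that begin with the same header. */
    typedef struct { PacketHdr hdr; uint32_t pages_alloc; /* ... */ } RamPacket;
    typedef struct { PacketHdr hdr; char idstr[256];      /* ... */ } OtherPacket;

    #define EXAMPLE_OTHER_FLAG (1u << 6)   /* placeholder flag bit */

    /*
     * After reading sizeof(PacketHdr) bytes, hdr->flags tells us which packet
     * type follows and therefore how many more bytes to read for its body.
     */
    static size_t packet_body_size(const PacketHdr *hdr)
    {
        return (hdr->flags & EXAMPLE_OTHER_FLAG)
               ? sizeof(OtherPacket) - sizeof(PacketHdr)
               : sizeof(RamPacket) - sizeof(PacketHdr);
    }
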
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Add basic support for receiving device state via multifd channels - | ||
4 | channels that are shared with RAM transfers. | ||
5 | |||
6 | Depending on whether the MULTIFD_FLAG_DEVICE_STATE flag is present in the | ||
7 | packet header, either device state (MultiFDPacketDeviceState_t) or RAM | ||
8 | data (existing MultiFDPacket_t) is read. | ||
9 | |||
10 | The received device state data is provided to | ||
11 | qemu_loadvm_load_state_buffer() function for processing in the | ||
12 | device's load_state_buffer handler. | ||
13 | |||
14 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
15 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
16 | Link: https://lore.kernel.org/qemu-devel/9b86f806c134e7815ecce0eee84f0e0e34aa0146.1741124640.git.maciej.szmigiero@oracle.com | ||
17 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
18 | --- | ||
19 | migration/multifd.h | 19 ++++++++- | ||
20 | migration/multifd.c | 101 +++++++++++++++++++++++++++++++++++++++----- | ||
21 | 2 files changed, 108 insertions(+), 12 deletions(-) | ||
22 | |||
23 | diff --git a/migration/multifd.h b/migration/multifd.h | ||
24 | index XXXXXXX..XXXXXXX 100644 | ||
25 | --- a/migration/multifd.h | ||
26 | +++ b/migration/multifd.h | ||
27 | @@ -XXX,XX +XXX,XX @@ MultiFDRecvData *multifd_get_recv_data(void); | ||
28 | #define MULTIFD_FLAG_UADK (8 << 1) | ||
29 | #define MULTIFD_FLAG_QATZIP (16 << 1) | ||
30 | |||
31 | +/* | ||
32 | + * If set it means that this packet contains device state | ||
33 | + * (MultiFDPacketDeviceState_t), not RAM data (MultiFDPacket_t). | ||
34 | + */ | ||
35 | +#define MULTIFD_FLAG_DEVICE_STATE (32 << 1) | ||
36 | + | ||
37 | /* This value needs to be a multiple of qemu_target_page_size() */ | ||
38 | #define MULTIFD_PACKET_SIZE (512 * 1024) | ||
39 | |||
40 | @@ -XXX,XX +XXX,XX @@ typedef struct { | ||
41 | uint64_t offset[]; | ||
42 | } __attribute__((packed)) MultiFDPacket_t; | ||
43 | |||
44 | +typedef struct { | ||
45 | + MultiFDPacketHdr_t hdr; | ||
46 | + | ||
47 | + char idstr[256]; | ||
48 | + uint32_t instance_id; | ||
49 | + | ||
50 | + /* size of the next packet that contains the actual data */ | ||
51 | + uint32_t next_packet_size; | ||
52 | +} __attribute__((packed)) MultiFDPacketDeviceState_t; | ||
53 | + | ||
54 | typedef struct { | ||
55 | /* number of used pages */ | ||
56 | uint32_t num; | ||
57 | @@ -XXX,XX +XXX,XX @@ typedef struct { | ||
58 | |||
59 | /* thread local variables. No locking required */ | ||
60 | |||
61 | - /* pointer to the packet */ | ||
62 | + /* pointers to the possible packet types */ | ||
63 | MultiFDPacket_t *packet; | ||
64 | + MultiFDPacketDeviceState_t *packet_dev_state; | ||
65 | /* size of the next packet that contains pages */ | ||
66 | uint32_t next_packet_size; | ||
67 | /* packets received through this channel */ | ||
68 | diff --git a/migration/multifd.c b/migration/multifd.c | ||
69 | index XXXXXXX..XXXXXXX 100644 | ||
70 | --- a/migration/multifd.c | ||
71 | +++ b/migration/multifd.c | ||
72 | @@ -XXX,XX +XXX,XX @@ | ||
73 | #include "file.h" | ||
74 | #include "migration.h" | ||
75 | #include "migration-stats.h" | ||
76 | +#include "savevm.h" | ||
77 | #include "socket.h" | ||
78 | #include "tls.h" | ||
79 | #include "qemu-file.h" | ||
80 | @@ -XXX,XX +XXX,XX @@ static int multifd_recv_unfill_packet_header(MultiFDRecvParams *p, | ||
81 | return 0; | ||
82 | } | ||
83 | |||
84 | -static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) | ||
85 | +static int multifd_recv_unfill_packet_device_state(MultiFDRecvParams *p, | ||
86 | + Error **errp) | ||
87 | +{ | ||
88 | + MultiFDPacketDeviceState_t *packet = p->packet_dev_state; | ||
89 | + | ||
90 | + packet->instance_id = be32_to_cpu(packet->instance_id); | ||
91 | + p->next_packet_size = be32_to_cpu(packet->next_packet_size); | ||
92 | + | ||
93 | + return 0; | ||
94 | +} | ||
95 | + | ||
96 | +static int multifd_recv_unfill_packet_ram(MultiFDRecvParams *p, Error **errp) | ||
97 | { | ||
98 | const MultiFDPacket_t *packet = p->packet; | ||
99 | int ret = 0; | ||
100 | |||
101 | p->next_packet_size = be32_to_cpu(packet->next_packet_size); | ||
102 | p->packet_num = be64_to_cpu(packet->packet_num); | ||
103 | - p->packets_recved++; | ||
104 | |||
105 | /* Always unfill, old QEMUs (<9.0) send data along with SYNC */ | ||
106 | ret = multifd_ram_unfill_packet(p, errp); | ||
107 | @@ -XXX,XX +XXX,XX @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) | ||
108 | return ret; | ||
109 | } | ||
110 | |||
111 | +static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) | ||
112 | +{ | ||
113 | + p->packets_recved++; | ||
114 | + | ||
115 | + if (p->flags & MULTIFD_FLAG_DEVICE_STATE) { | ||
116 | + return multifd_recv_unfill_packet_device_state(p, errp); | ||
117 | + } | ||
118 | + | ||
119 | + return multifd_recv_unfill_packet_ram(p, errp); | ||
120 | +} | ||
121 | + | ||
122 | static bool multifd_send_should_exit(void) | ||
123 | { | ||
124 | return qatomic_read(&multifd_send_state->exiting); | ||
125 | @@ -XXX,XX +XXX,XX @@ static void multifd_recv_cleanup_channel(MultiFDRecvParams *p) | ||
126 | p->packet_len = 0; | ||
127 | g_free(p->packet); | ||
128 | p->packet = NULL; | ||
129 | + g_clear_pointer(&p->packet_dev_state, g_free); | ||
130 | g_free(p->normal); | ||
131 | p->normal = NULL; | ||
132 | g_free(p->zero); | ||
133 | @@ -XXX,XX +XXX,XX @@ void multifd_recv_sync_main(void) | ||
134 | trace_multifd_recv_sync_main(multifd_recv_state->packet_num); | ||
135 | } | ||
136 | |||
137 | +static int multifd_device_state_recv(MultiFDRecvParams *p, Error **errp) | ||
138 | +{ | ||
139 | + g_autofree char *dev_state_buf = NULL; | ||
140 | + int ret; | ||
141 | + | ||
142 | + dev_state_buf = g_malloc(p->next_packet_size); | ||
143 | + | ||
144 | + ret = qio_channel_read_all(p->c, dev_state_buf, p->next_packet_size, errp); | ||
145 | + if (ret != 0) { | ||
146 | + return ret; | ||
147 | + } | ||
148 | + | ||
149 | + if (p->packet_dev_state->idstr[sizeof(p->packet_dev_state->idstr) - 1] | ||
150 | + != 0) { | ||
151 | + error_setg(errp, "unterminated multifd device state idstr"); | ||
152 | + return -1; | ||
153 | + } | ||
154 | + | ||
155 | + if (!qemu_loadvm_load_state_buffer(p->packet_dev_state->idstr, | ||
156 | + p->packet_dev_state->instance_id, | ||
157 | + dev_state_buf, p->next_packet_size, | ||
158 | + errp)) { | ||
159 | + ret = -1; | ||
160 | + } | ||
161 | + | ||
162 | + return ret; | ||
163 | +} | ||
164 | + | ||
165 | static void *multifd_recv_thread(void *opaque) | ||
166 | { | ||
167 | MigrationState *s = migrate_get_current(); | ||
168 | @@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque) | ||
169 | while (true) { | ||
170 | MultiFDPacketHdr_t hdr; | ||
171 | uint32_t flags = 0; | ||
172 | + bool is_device_state = false; | ||
173 | bool has_data = false; | ||
174 | uint8_t *pkt_buf; | ||
175 | size_t pkt_len; | ||
176 | @@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque) | ||
177 | break; | ||
178 | } | ||
179 | |||
180 | - pkt_buf = (uint8_t *)p->packet + sizeof(hdr); | ||
181 | - pkt_len = p->packet_len - sizeof(hdr); | ||
182 | + is_device_state = p->flags & MULTIFD_FLAG_DEVICE_STATE; | ||
183 | + if (is_device_state) { | ||
184 | + pkt_buf = (uint8_t *)p->packet_dev_state + sizeof(hdr); | ||
185 | + pkt_len = sizeof(*p->packet_dev_state) - sizeof(hdr); | ||
186 | + } else { | ||
187 | + pkt_buf = (uint8_t *)p->packet + sizeof(hdr); | ||
188 | + pkt_len = p->packet_len - sizeof(hdr); | ||
189 | + } | ||
190 | |||
191 | ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len, | ||
192 | &local_err); | ||
193 | @@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque) | ||
194 | /* recv methods don't know how to handle the SYNC flag */ | ||
195 | p->flags &= ~MULTIFD_FLAG_SYNC; | ||
196 | |||
197 | - /* | ||
198 | - * Even if it's a SYNC packet, this needs to be set | ||
199 | - * because older QEMUs (<9.0) still send data along with | ||
200 | - * the SYNC packet. | ||
201 | - */ | ||
202 | - has_data = p->normal_num || p->zero_num; | ||
203 | + if (is_device_state) { | ||
204 | + has_data = p->next_packet_size > 0; | ||
205 | + } else { | ||
206 | + /* | ||
207 | + * Even if it's a SYNC packet, this needs to be set | ||
208 | + * because older QEMUs (<9.0) still send data along with | ||
209 | + * the SYNC packet. | ||
210 | + */ | ||
211 | + has_data = p->normal_num || p->zero_num; | ||
212 | + } | ||
213 | + | ||
214 | qemu_mutex_unlock(&p->mutex); | ||
215 | } else { | ||
216 | /* | ||
217 | @@ -XXX,XX +XXX,XX @@ static void *multifd_recv_thread(void *opaque) | ||
218 | } | ||
219 | |||
220 | if (has_data) { | ||
221 | - ret = multifd_recv_state->ops->recv(p, &local_err); | ||
222 | + if (is_device_state) { | ||
223 | + assert(use_packets); | ||
224 | + ret = multifd_device_state_recv(p, &local_err); | ||
225 | + } else { | ||
226 | + ret = multifd_recv_state->ops->recv(p, &local_err); | ||
227 | + } | ||
228 | if (ret != 0) { | ||
229 | break; | ||
230 | } | ||
231 | + } else if (is_device_state) { | ||
232 | + error_setg(&local_err, | ||
233 | + "multifd: received empty device state packet"); | ||
234 | + break; | ||
235 | } | ||
236 | |||
237 | if (use_packets) { | ||
238 | if (flags & MULTIFD_FLAG_SYNC) { | ||
239 | + if (is_device_state) { | ||
240 | + error_setg(&local_err, | ||
241 | + "multifd: received SYNC device state packet"); | ||
242 | + break; | ||
243 | + } | ||
244 | + | ||
245 | qemu_sem_post(&multifd_recv_state->sem_sync); | ||
246 | qemu_sem_wait(&p->sem_sync); | ||
247 | } | ||
248 | @@ -XXX,XX +XXX,XX @@ int multifd_recv_setup(Error **errp) | ||
249 | p->packet_len = sizeof(MultiFDPacket_t) | ||
250 | + sizeof(uint64_t) * page_count; | ||
251 | p->packet = g_malloc0(p->packet_len); | ||
252 | + p->packet_dev_state = g_malloc0(sizeof(*p->packet_dev_state)); | ||
253 | } | ||
254 | p->name = g_strdup_printf(MIGRATION_THREAD_DST_MULTIFD, i); | ||
255 | p->normal = g_new0(ram_addr_t, page_count); | ||
256 | -- | ||
257 | 2.48.1 | ||
258 | |||
259 | diff view generated by jsdifflib |
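
For orientation, this is the resulting on-wire layout of a device state packet as read from the structures above; the sketch is schematic (endianness conversions and padding are omitted) and only the field names shown in the patch are taken as given.

    /*
     * [ MultiFDPacketHdr_t ]                 magic, version, flags
     *                                        (flags has MULTIFD_FLAG_DEVICE_STATE set)
     * [ rest of MultiFDPacketDeviceState_t ]
     *     idstr[256]                         NUL-terminated SaveStateEntry idstr
     *     instance_id (be32)
     *     next_packet_size (be32)
     * [ next_packet_size bytes ]             opaque device state blob, handed to
     *                                        qemu_loadvm_load_state_buffer()
     */
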
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | The multifd_send() function is currently not thread safe; make it thread safe | ||
4 | by holding a lock during its execution. | ||
5 | |||
6 | This way it will be possible to safely call it concurrently from multiple | ||
7 | threads. | ||
8 | |||
9 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
11 | Link: https://lore.kernel.org/qemu-devel/dd0f3bcc02ca96a7d523ca58ea69e495a33b453b.1741124640.git.maciej.szmigiero@oracle.com | ||
12 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
13 | --- | ||
14 | migration/multifd.c | 8 ++++++++ | ||
15 | 1 file changed, 8 insertions(+) | ||
16 | |||
17 | diff --git a/migration/multifd.c b/migration/multifd.c | ||
18 | index XXXXXXX..XXXXXXX 100644 | ||
19 | --- a/migration/multifd.c | ||
20 | +++ b/migration/multifd.c | ||
21 | @@ -XXX,XX +XXX,XX @@ typedef struct { | ||
22 | |||
23 | struct { | ||
24 | MultiFDSendParams *params; | ||
25 | + | ||
26 | + /* multifd_send() body is not thread safe, needs serialization */ | ||
27 | + QemuMutex multifd_send_mutex; | ||
28 | + | ||
29 | /* | ||
30 | * Global number of generated multifd packets. | ||
31 | * | ||
32 | @@ -XXX,XX +XXX,XX @@ bool multifd_send(MultiFDSendData **send_data) | ||
33 | return false; | ||
34 | } | ||
35 | |||
36 | + QEMU_LOCK_GUARD(&multifd_send_state->multifd_send_mutex); | ||
37 | + | ||
38 | /* We wait here, until at least one channel is ready */ | ||
39 | qemu_sem_wait(&multifd_send_state->channels_ready); | ||
40 | |||
41 | @@ -XXX,XX +XXX,XX @@ static void multifd_send_cleanup_state(void) | ||
42 | socket_cleanup_outgoing_migration(); | ||
43 | qemu_sem_destroy(&multifd_send_state->channels_created); | ||
44 | qemu_sem_destroy(&multifd_send_state->channels_ready); | ||
45 | + qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); | ||
46 | g_free(multifd_send_state->params); | ||
47 | multifd_send_state->params = NULL; | ||
48 | g_free(multifd_send_state); | ||
49 | @@ -XXX,XX +XXX,XX @@ bool multifd_send_setup(void) | ||
50 | thread_count = migrate_multifd_channels(); | ||
51 | multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); | ||
52 | multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); | ||
53 | + qemu_mutex_init(&multifd_send_state->multifd_send_mutex); | ||
54 | qemu_sem_init(&multifd_send_state->channels_created, 0); | ||
55 | qemu_sem_init(&multifd_send_state->channels_ready, 0); | ||
56 | qatomic_set(&multifd_send_state->exiting, 0); | ||
57 | -- | ||
58 | 2.48.1 | ||
59 | |||
60 | diff view generated by jsdifflib |
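
For readers less familiar with QEMU's lock guards, a minimal sketch of the pattern the patch relies on follows. Only QemuMutex and QEMU_LOCK_GUARD() themselves are real; the mutex, precondition, and function here are hypothetical stand-ins for the multifd_send() body.

    #include "qemu/osdep.h"
    #include "qemu/lockable.h"

    static QemuMutex example_mutex;          /* hypothetical */
    static bool example_precondition(void);  /* hypothetical */

    static bool serialized_body(void)
    {
        QEMU_LOCK_GUARD(&example_mutex);     /* released when the scope is left */

        if (!example_precondition()) {
            return false;                    /* unlock happens automatically here */
        }
        /* ...critical section shared by RAM and device-state producers... */
        return true;                         /* and here as well */
    }
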
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | This way, if there are fields that need explicit disposal (for | ||
4 | example, some attached buffers) they will be handled appropriately. | ||
5 | |||
6 | Add a related assert to multifd_set_payload_type() in order to make sure | ||
7 | that this function is only used to fill a previously empty MultiFDSendData | ||
8 | with some payload, not the other way around. | ||
9 | |||
10 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
11 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
12 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
13 | Link: https://lore.kernel.org/qemu-devel/6755205f2b95abbed251f87061feee1c0e410836.1741124640.git.maciej.szmigiero@oracle.com | ||
14 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
15 | --- | ||
16 | migration/multifd.h | 5 +++++ | ||
17 | migration/multifd-nocomp.c | 3 +-- | ||
18 | migration/multifd.c | 31 ++++++++++++++++++++++++++++--- | ||
19 | 3 files changed, 34 insertions(+), 5 deletions(-) | ||
20 | |||
21 | diff --git a/migration/multifd.h b/migration/multifd.h | ||
22 | index XXXXXXX..XXXXXXX 100644 | ||
23 | --- a/migration/multifd.h | ||
24 | +++ b/migration/multifd.h | ||
25 | @@ -XXX,XX +XXX,XX @@ static inline bool multifd_payload_empty(MultiFDSendData *data) | ||
26 | static inline void multifd_set_payload_type(MultiFDSendData *data, | ||
27 | MultiFDPayloadType type) | ||
28 | { | ||
29 | + assert(multifd_payload_empty(data)); | ||
30 | + assert(type != MULTIFD_PAYLOAD_NONE); | ||
31 | + | ||
32 | data->type = type; | ||
33 | } | ||
34 | |||
35 | @@ -XXX,XX +XXX,XX @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p) | ||
36 | void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); | ||
37 | bool multifd_send(MultiFDSendData **send_data); | ||
38 | MultiFDSendData *multifd_send_data_alloc(void); | ||
39 | +void multifd_send_data_clear(MultiFDSendData *data); | ||
40 | +void multifd_send_data_free(MultiFDSendData *data); | ||
41 | |||
42 | static inline uint32_t multifd_ram_page_size(void) | ||
43 | { | ||
44 | diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c | ||
45 | index XXXXXXX..XXXXXXX 100644 | ||
46 | --- a/migration/multifd-nocomp.c | ||
47 | +++ b/migration/multifd-nocomp.c | ||
48 | @@ -XXX,XX +XXX,XX @@ void multifd_ram_save_setup(void) | ||
49 | |||
50 | void multifd_ram_save_cleanup(void) | ||
51 | { | ||
52 | - g_free(multifd_ram_send); | ||
53 | - multifd_ram_send = NULL; | ||
54 | + g_clear_pointer(&multifd_ram_send, multifd_send_data_free); | ||
55 | } | ||
56 | |||
57 | static void multifd_set_file_bitmap(MultiFDSendParams *p) | ||
58 | diff --git a/migration/multifd.c b/migration/multifd.c | ||
59 | index XXXXXXX..XXXXXXX 100644 | ||
60 | --- a/migration/multifd.c | ||
61 | +++ b/migration/multifd.c | ||
62 | @@ -XXX,XX +XXX,XX @@ MultiFDSendData *multifd_send_data_alloc(void) | ||
63 | return g_malloc0(size_minus_payload + max_payload_size); | ||
64 | } | ||
65 | |||
66 | +void multifd_send_data_clear(MultiFDSendData *data) | ||
67 | +{ | ||
68 | + if (multifd_payload_empty(data)) { | ||
69 | + return; | ||
70 | + } | ||
71 | + | ||
72 | + switch (data->type) { | ||
73 | + default: | ||
74 | + /* Nothing to do */ | ||
75 | + break; | ||
76 | + } | ||
77 | + | ||
78 | + data->type = MULTIFD_PAYLOAD_NONE; | ||
79 | +} | ||
80 | + | ||
81 | +void multifd_send_data_free(MultiFDSendData *data) | ||
82 | +{ | ||
83 | + if (!data) { | ||
84 | + return; | ||
85 | + } | ||
86 | + | ||
87 | + multifd_send_data_clear(data); | ||
88 | + | ||
89 | + g_free(data); | ||
90 | +} | ||
91 | + | ||
92 | static bool multifd_use_packets(void) | ||
93 | { | ||
94 | return !migrate_mapped_ram(); | ||
95 | @@ -XXX,XX +XXX,XX @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) | ||
96 | qemu_sem_destroy(&p->sem_sync); | ||
97 | g_free(p->name); | ||
98 | p->name = NULL; | ||
99 | - g_free(p->data); | ||
100 | - p->data = NULL; | ||
101 | + g_clear_pointer(&p->data, multifd_send_data_free); | ||
102 | p->packet_len = 0; | ||
103 | g_free(p->packet); | ||
104 | p->packet = NULL; | ||
105 | @@ -XXX,XX +XXX,XX @@ static void *multifd_send_thread(void *opaque) | ||
106 | (uint64_t)p->next_packet_size + p->packet_len); | ||
107 | |||
108 | p->next_packet_size = 0; | ||
109 | - multifd_set_payload_type(p->data, MULTIFD_PAYLOAD_NONE); | ||
110 | + multifd_send_data_clear(p->data); | ||
111 | |||
112 | /* | ||
113 | * Making sure p->data is published before saying "we're | ||
114 | -- | ||
115 | 2.48.1 | ||
116 | |||
117 | diff view generated by jsdifflib |
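
Putting the new helpers together, the intended MultiFDSendData lifecycle looks roughly like this; the caller is hypothetical and shown only for illustration, while the helpers and the payload-type constant are the ones visible in the diff above.

    /* Illustrative only -- the real callers are the multifd RAM/device-state code. */
    static void example_lifecycle(void)
    {
        MultiFDSendData *data = multifd_send_data_alloc();

        multifd_set_payload_type(data, MULTIFD_PAYLOAD_RAM); /* asserts it was empty */
        /* ...fill data->u.ram; real code would now hand it to multifd_send()... */

        multifd_send_data_clear(data);   /* dispose of the payload, back to PAYLOAD_NONE */
        multifd_send_data_free(data);    /* clears again if needed, then g_free() */
    }
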
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | A new function multifd_queue_device_state() is provided for a device to queue | ||
4 | its state for transmission via a multifd channel. | ||
5 | |||
6 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
8 | Link: https://lore.kernel.org/qemu-devel/ebd55768d3e5fecb5eb3f197bad9c0c07e5bc084.1741124640.git.maciej.szmigiero@oracle.com | ||
9 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
10 | --- | ||
11 | include/migration/misc.h | 4 ++ | ||
12 | migration/multifd.h | 34 ++++++--- | ||
13 | migration/multifd-device-state.c | 118 +++++++++++++++++++++++++++++++ | ||
14 | migration/multifd-nocomp.c | 14 +++- | ||
15 | migration/multifd.c | 42 +++++++++-- | ||
16 | migration/meson.build | 1 + | ||
17 | 6 files changed, 197 insertions(+), 16 deletions(-) | ||
18 | create mode 100644 migration/multifd-device-state.c | ||
19 | |||
20 | diff --git a/include/migration/misc.h b/include/migration/misc.h | ||
21 | index XXXXXXX..XXXXXXX 100644 | ||
22 | --- a/include/migration/misc.h | ||
23 | +++ b/include/migration/misc.h | ||
24 | @@ -XXX,XX +XXX,XX @@ bool migrate_is_uri(const char *uri); | ||
25 | bool migrate_uri_parse(const char *uri, MigrationChannel **channel, | ||
26 | Error **errp); | ||
27 | |||
28 | +/* migration/multifd-device-state.c */ | ||
29 | +bool multifd_queue_device_state(char *idstr, uint32_t instance_id, | ||
30 | + char *data, size_t len); | ||
31 | + | ||
32 | #endif | ||
33 | diff --git a/migration/multifd.h b/migration/multifd.h | ||
34 | index XXXXXXX..XXXXXXX 100644 | ||
35 | --- a/migration/multifd.h | ||
36 | +++ b/migration/multifd.h | ||
37 | @@ -XXX,XX +XXX,XX @@ struct MultiFDRecvData { | ||
38 | off_t file_offset; | ||
39 | }; | ||
40 | |||
41 | +typedef struct { | ||
42 | + char *idstr; | ||
43 | + uint32_t instance_id; | ||
44 | + char *buf; | ||
45 | + size_t buf_len; | ||
46 | +} MultiFDDeviceState_t; | ||
47 | + | ||
48 | typedef enum { | ||
49 | MULTIFD_PAYLOAD_NONE, | ||
50 | MULTIFD_PAYLOAD_RAM, | ||
51 | + MULTIFD_PAYLOAD_DEVICE_STATE, | ||
52 | } MultiFDPayloadType; | ||
53 | |||
54 | typedef union MultiFDPayload { | ||
55 | MultiFDPages_t ram; | ||
56 | + MultiFDDeviceState_t device_state; | ||
57 | } MultiFDPayload; | ||
58 | |||
59 | struct MultiFDSendData { | ||
60 | @@ -XXX,XX +XXX,XX @@ static inline bool multifd_payload_empty(MultiFDSendData *data) | ||
61 | return data->type == MULTIFD_PAYLOAD_NONE; | ||
62 | } | ||
63 | |||
64 | +static inline bool multifd_payload_device_state(MultiFDSendData *data) | ||
65 | +{ | ||
66 | + return data->type == MULTIFD_PAYLOAD_DEVICE_STATE; | ||
67 | +} | ||
68 | + | ||
69 | static inline void multifd_set_payload_type(MultiFDSendData *data, | ||
70 | MultiFDPayloadType type) | ||
71 | { | ||
72 | @@ -XXX,XX +XXX,XX @@ typedef struct { | ||
73 | |||
74 | /* thread local variables. No locking required */ | ||
75 | |||
76 | - /* pointer to the packet */ | ||
77 | + /* pointers to the possible packet types */ | ||
78 | MultiFDPacket_t *packet; | ||
79 | + MultiFDPacketDeviceState_t *packet_device_state; | ||
80 | /* size of the next packet that contains pages */ | ||
81 | uint32_t next_packet_size; | ||
82 | /* packets sent through this channel */ | ||
83 | @@ -XXX,XX +XXX,XX @@ bool multifd_send_prepare_common(MultiFDSendParams *p); | ||
84 | void multifd_send_zero_page_detect(MultiFDSendParams *p); | ||
85 | void multifd_recv_zero_page_process(MultiFDRecvParams *p); | ||
86 | |||
87 | -static inline void multifd_send_prepare_header(MultiFDSendParams *p) | ||
88 | -{ | ||
89 | - p->iov[0].iov_len = p->packet_len; | ||
90 | - p->iov[0].iov_base = p->packet; | ||
91 | - p->iovs_num++; | ||
92 | -} | ||
93 | - | ||
94 | void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); | ||
95 | bool multifd_send(MultiFDSendData **send_data); | ||
96 | MultiFDSendData *multifd_send_data_alloc(void); | ||
97 | @@ -XXX,XX +XXX,XX @@ bool multifd_ram_sync_per_section(void); | ||
98 | size_t multifd_ram_payload_size(void); | ||
99 | void multifd_ram_fill_packet(MultiFDSendParams *p); | ||
100 | int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp); | ||
101 | + | ||
102 | +size_t multifd_device_state_payload_size(void); | ||
103 | + | ||
104 | +void multifd_send_data_clear_device_state(MultiFDDeviceState_t *device_state); | ||
105 | + | ||
106 | +void multifd_device_state_send_setup(void); | ||
107 | +void multifd_device_state_send_cleanup(void); | ||
108 | + | ||
109 | +void multifd_device_state_send_prepare(MultiFDSendParams *p); | ||
110 | + | ||
111 | #endif | ||
112 | diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c | ||
113 | new file mode 100644 | ||
114 | index XXXXXXX..XXXXXXX | ||
115 | --- /dev/null | ||
116 | +++ b/migration/multifd-device-state.c | ||
117 | @@ -XXX,XX +XXX,XX @@ | ||
118 | +/* | ||
119 | + * Multifd device state migration | ||
120 | + * | ||
121 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. | ||
122 | + * | ||
123 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. | ||
124 | + * See the COPYING file in the top-level directory. | ||
125 | + * | ||
126 | + * SPDX-License-Identifier: GPL-2.0-or-later | ||
127 | + */ | ||
128 | + | ||
129 | +#include "qemu/osdep.h" | ||
130 | +#include "qemu/lockable.h" | ||
131 | +#include "migration/misc.h" | ||
132 | +#include "multifd.h" | ||
133 | + | ||
134 | +static struct { | ||
135 | + QemuMutex queue_job_mutex; | ||
136 | + | ||
137 | + MultiFDSendData *send_data; | ||
138 | +} *multifd_send_device_state; | ||
139 | + | ||
140 | +size_t multifd_device_state_payload_size(void) | ||
141 | +{ | ||
142 | + return sizeof(MultiFDDeviceState_t); | ||
143 | +} | ||
144 | + | ||
145 | +void multifd_device_state_send_setup(void) | ||
146 | +{ | ||
147 | + assert(!multifd_send_device_state); | ||
148 | + multifd_send_device_state = g_malloc(sizeof(*multifd_send_device_state)); | ||
149 | + | ||
150 | + qemu_mutex_init(&multifd_send_device_state->queue_job_mutex); | ||
151 | + | ||
152 | + multifd_send_device_state->send_data = multifd_send_data_alloc(); | ||
153 | +} | ||
154 | + | ||
155 | +void multifd_device_state_send_cleanup(void) | ||
156 | +{ | ||
157 | + g_clear_pointer(&multifd_send_device_state->send_data, | ||
158 | + multifd_send_data_free); | ||
159 | + | ||
160 | + qemu_mutex_destroy(&multifd_send_device_state->queue_job_mutex); | ||
161 | + | ||
162 | + g_clear_pointer(&multifd_send_device_state, g_free); | ||
163 | +} | ||
164 | + | ||
165 | +void multifd_send_data_clear_device_state(MultiFDDeviceState_t *device_state) | ||
166 | +{ | ||
167 | + g_clear_pointer(&device_state->idstr, g_free); | ||
168 | + g_clear_pointer(&device_state->buf, g_free); | ||
169 | +} | ||
170 | + | ||
171 | +static void multifd_device_state_fill_packet(MultiFDSendParams *p) | ||
172 | +{ | ||
173 | + MultiFDDeviceState_t *device_state = &p->data->u.device_state; | ||
174 | + MultiFDPacketDeviceState_t *packet = p->packet_device_state; | ||
175 | + | ||
176 | + packet->hdr.flags = cpu_to_be32(p->flags); | ||
177 | + strncpy(packet->idstr, device_state->idstr, sizeof(packet->idstr) - 1); | ||
178 | + packet->idstr[sizeof(packet->idstr) - 1] = 0; | ||
179 | + packet->instance_id = cpu_to_be32(device_state->instance_id); | ||
180 | + packet->next_packet_size = cpu_to_be32(p->next_packet_size); | ||
181 | +} | ||
182 | + | ||
183 | +static void multifd_prepare_header_device_state(MultiFDSendParams *p) | ||
184 | +{ | ||
185 | + p->iov[0].iov_len = sizeof(*p->packet_device_state); | ||
186 | + p->iov[0].iov_base = p->packet_device_state; | ||
187 | + p->iovs_num++; | ||
188 | +} | ||
189 | + | ||
190 | +void multifd_device_state_send_prepare(MultiFDSendParams *p) | ||
191 | +{ | ||
192 | + MultiFDDeviceState_t *device_state = &p->data->u.device_state; | ||
193 | + | ||
194 | + assert(multifd_payload_device_state(p->data)); | ||
195 | + | ||
196 | + multifd_prepare_header_device_state(p); | ||
197 | + | ||
198 | + assert(!(p->flags & MULTIFD_FLAG_SYNC)); | ||
199 | + | ||
200 | + p->next_packet_size = device_state->buf_len; | ||
201 | + if (p->next_packet_size > 0) { | ||
202 | + p->iov[p->iovs_num].iov_base = device_state->buf; | ||
203 | + p->iov[p->iovs_num].iov_len = p->next_packet_size; | ||
204 | + p->iovs_num++; | ||
205 | + } | ||
206 | + | ||
207 | + p->flags |= MULTIFD_FLAG_NOCOMP | MULTIFD_FLAG_DEVICE_STATE; | ||
208 | + | ||
209 | + multifd_device_state_fill_packet(p); | ||
210 | +} | ||
211 | + | ||
212 | +bool multifd_queue_device_state(char *idstr, uint32_t instance_id, | ||
213 | + char *data, size_t len) | ||
214 | +{ | ||
215 | + /* Device state submissions can come from multiple threads */ | ||
216 | + QEMU_LOCK_GUARD(&multifd_send_device_state->queue_job_mutex); | ||
217 | + MultiFDDeviceState_t *device_state; | ||
218 | + | ||
219 | + assert(multifd_payload_empty(multifd_send_device_state->send_data)); | ||
220 | + | ||
221 | + multifd_set_payload_type(multifd_send_device_state->send_data, | ||
222 | + MULTIFD_PAYLOAD_DEVICE_STATE); | ||
223 | + device_state = &multifd_send_device_state->send_data->u.device_state; | ||
224 | + device_state->idstr = g_strdup(idstr); | ||
225 | + device_state->instance_id = instance_id; | ||
226 | + device_state->buf = g_memdup2(data, len); | ||
227 | + device_state->buf_len = len; | ||
228 | + | ||
229 | + if (!multifd_send(&multifd_send_device_state->send_data)) { | ||
230 | + multifd_send_data_clear(multifd_send_device_state->send_data); | ||
231 | + return false; | ||
232 | + } | ||
233 | + | ||
234 | + return true; | ||
235 | +} | ||
236 | diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c | ||
237 | index XXXXXXX..XXXXXXX 100644 | ||
238 | --- a/migration/multifd-nocomp.c | ||
239 | +++ b/migration/multifd-nocomp.c | ||
240 | @@ -XXX,XX +XXX,XX @@ | ||
241 | #include "exec/ramblock.h" | ||
242 | #include "exec/target_page.h" | ||
243 | #include "file.h" | ||
244 | +#include "migration-stats.h" | ||
245 | #include "multifd.h" | ||
246 | #include "options.h" | ||
247 | #include "qapi/error.h" | ||
248 | @@ -XXX,XX +XXX,XX @@ static void multifd_nocomp_send_cleanup(MultiFDSendParams *p, Error **errp) | ||
249 | return; | ||
250 | } | ||
251 | |||
252 | +static void multifd_ram_prepare_header(MultiFDSendParams *p) | ||
253 | +{ | ||
254 | + p->iov[0].iov_len = p->packet_len; | ||
255 | + p->iov[0].iov_base = p->packet; | ||
256 | + p->iovs_num++; | ||
257 | +} | ||
258 | + | ||
259 | static void multifd_send_prepare_iovs(MultiFDSendParams *p) | ||
260 | { | ||
261 | MultiFDPages_t *pages = &p->data->u.ram; | ||
262 | @@ -XXX,XX +XXX,XX @@ static int multifd_nocomp_send_prepare(MultiFDSendParams *p, Error **errp) | ||
263 | * Only !zerocopy needs the header in IOV; zerocopy will | ||
264 | * send it separately. | ||
265 | */ | ||
266 | - multifd_send_prepare_header(p); | ||
267 | + multifd_ram_prepare_header(p); | ||
268 | } | ||
269 | |||
270 | multifd_send_prepare_iovs(p); | ||
271 | @@ -XXX,XX +XXX,XX @@ static int multifd_nocomp_send_prepare(MultiFDSendParams *p, Error **errp) | ||
272 | if (ret != 0) { | ||
273 | return -1; | ||
274 | } | ||
275 | + | ||
276 | + stat64_add(&mig_stats.multifd_bytes, p->packet_len); | ||
277 | } | ||
278 | |||
279 | return 0; | ||
280 | @@ -XXX,XX +XXX,XX @@ int multifd_ram_flush_and_sync(QEMUFile *f) | ||
281 | bool multifd_send_prepare_common(MultiFDSendParams *p) | ||
282 | { | ||
283 | MultiFDPages_t *pages = &p->data->u.ram; | ||
284 | - multifd_send_prepare_header(p); | ||
285 | + multifd_ram_prepare_header(p); | ||
286 | multifd_send_zero_page_detect(p); | ||
287 | |||
288 | if (!pages->normal_num) { | ||
289 | diff --git a/migration/multifd.c b/migration/multifd.c | ||
290 | index XXXXXXX..XXXXXXX 100644 | ||
291 | --- a/migration/multifd.c | ||
292 | +++ b/migration/multifd.c | ||
293 | @@ -XXX,XX +XXX,XX @@ | ||
294 | |||
295 | #include "qemu/osdep.h" | ||
296 | #include "qemu/cutils.h" | ||
297 | +#include "qemu/iov.h" | ||
298 | #include "qemu/rcu.h" | ||
299 | #include "exec/target_page.h" | ||
300 | #include "system/system.h" | ||
301 | @@ -XXX,XX +XXX,XX @@ | ||
302 | #include "qemu/error-report.h" | ||
303 | #include "qapi/error.h" | ||
304 | #include "file.h" | ||
305 | +#include "migration/misc.h" | ||
306 | #include "migration.h" | ||
307 | #include "migration-stats.h" | ||
308 | #include "savevm.h" | ||
309 | @@ -XXX,XX +XXX,XX @@ MultiFDSendData *multifd_send_data_alloc(void) | ||
310 | * added to the union in the future are larger than | ||
311 | * (MultiFDPages_t + flex array). | ||
312 | */ | ||
313 | - max_payload_size = MAX(multifd_ram_payload_size(), sizeof(MultiFDPayload)); | ||
314 | + max_payload_size = MAX(multifd_ram_payload_size(), | ||
315 | + multifd_device_state_payload_size()); | ||
316 | + max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); | ||
317 | |||
318 | /* | ||
319 | * Account for any holes the compiler might insert. We can't pack | ||
320 | @@ -XXX,XX +XXX,XX @@ void multifd_send_data_clear(MultiFDSendData *data) | ||
321 | } | ||
322 | |||
323 | switch (data->type) { | ||
324 | + case MULTIFD_PAYLOAD_DEVICE_STATE: | ||
325 | + multifd_send_data_clear_device_state(&data->u.device_state); | ||
326 | + break; | ||
327 | default: | ||
328 | /* Nothing to do */ | ||
329 | break; | ||
330 | @@ -XXX,XX +XXX,XX @@ static int multifd_recv_initial_packet(QIOChannel *c, Error **errp) | ||
331 | return msg.id; | ||
332 | } | ||
333 | |||
334 | +/* Fills a RAM multifd packet */ | ||
335 | void multifd_send_fill_packet(MultiFDSendParams *p) | ||
336 | { | ||
337 | MultiFDPacket_t *packet = p->packet; | ||
338 | @@ -XXX,XX +XXX,XX @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) | ||
339 | p->name = NULL; | ||
340 | g_clear_pointer(&p->data, multifd_send_data_free); | ||
341 | p->packet_len = 0; | ||
342 | + g_clear_pointer(&p->packet_device_state, g_free); | ||
343 | g_free(p->packet); | ||
344 | p->packet = NULL; | ||
345 | multifd_send_state->ops->send_cleanup(p, errp); | ||
346 | @@ -XXX,XX +XXX,XX @@ static void multifd_send_cleanup_state(void) | ||
347 | { | ||
348 | file_cleanup_outgoing_migration(); | ||
349 | socket_cleanup_outgoing_migration(); | ||
350 | + multifd_device_state_send_cleanup(); | ||
351 | qemu_sem_destroy(&multifd_send_state->channels_created); | ||
352 | qemu_sem_destroy(&multifd_send_state->channels_ready); | ||
353 | qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); | ||
354 | @@ -XXX,XX +XXX,XX @@ static void *multifd_send_thread(void *opaque) | ||
355 | * qatomic_store_release() in multifd_send(). | ||
356 | */ | ||
357 | if (qatomic_load_acquire(&p->pending_job)) { | ||
358 | + bool is_device_state = multifd_payload_device_state(p->data); | ||
359 | + size_t total_size; | ||
360 | + | ||
361 | p->flags = 0; | ||
362 | p->iovs_num = 0; | ||
363 | assert(!multifd_payload_empty(p->data)); | ||
364 | |||
365 | - ret = multifd_send_state->ops->send_prepare(p, &local_err); | ||
366 | - if (ret != 0) { | ||
367 | - break; | ||
368 | + if (is_device_state) { | ||
369 | + multifd_device_state_send_prepare(p); | ||
370 | + } else { | ||
371 | + ret = multifd_send_state->ops->send_prepare(p, &local_err); | ||
372 | + if (ret != 0) { | ||
373 | + break; | ||
374 | + } | ||
375 | } | ||
376 | |||
377 | + /* | ||
378 | + * The packet header in the zerocopy RAM case is accounted for | ||
379 | + * in multifd_nocomp_send_prepare() - where it is actually | ||
380 | + * being sent. | ||
381 | + */ | ||
382 | + total_size = iov_size(p->iov, p->iovs_num); | ||
383 | + | ||
384 | if (migrate_mapped_ram()) { | ||
385 | + assert(!is_device_state); | ||
386 | + | ||
387 | ret = file_write_ramblock_iov(p->c, p->iov, p->iovs_num, | ||
388 | &p->data->u.ram, &local_err); | ||
389 | } else { | ||
390 | @@ -XXX,XX +XXX,XX @@ static void *multifd_send_thread(void *opaque) | ||
391 | break; | ||
392 | } | ||
393 | |||
394 | - stat64_add(&mig_stats.multifd_bytes, | ||
395 | - (uint64_t)p->next_packet_size + p->packet_len); | ||
396 | + stat64_add(&mig_stats.multifd_bytes, total_size); | ||
397 | |||
398 | p->next_packet_size = 0; | ||
399 | multifd_send_data_clear(p->data); | ||
400 | @@ -XXX,XX +XXX,XX @@ bool multifd_send_setup(void) | ||
401 | p->packet_len = sizeof(MultiFDPacket_t) | ||
402 | + sizeof(uint64_t) * page_count; | ||
403 | p->packet = g_malloc0(p->packet_len); | ||
404 | + p->packet_device_state = g_malloc0(sizeof(*p->packet_device_state)); | ||
405 | + p->packet_device_state->hdr.magic = cpu_to_be32(MULTIFD_MAGIC); | ||
406 | + p->packet_device_state->hdr.version = cpu_to_be32(MULTIFD_VERSION); | ||
407 | } | ||
408 | p->name = g_strdup_printf(MIGRATION_THREAD_SRC_MULTIFD, i); | ||
409 | p->write_flags = 0; | ||
410 | @@ -XXX,XX +XXX,XX @@ bool multifd_send_setup(void) | ||
411 | assert(p->iov); | ||
412 | } | ||
413 | |||
414 | + multifd_device_state_send_setup(); | ||
415 | + | ||
416 | return true; | ||
417 | |||
418 | err: | ||
419 | diff --git a/migration/meson.build b/migration/meson.build | ||
420 | index XXXXXXX..XXXXXXX 100644 | ||
421 | --- a/migration/meson.build | ||
422 | +++ b/migration/meson.build | ||
423 | @@ -XXX,XX +XXX,XX @@ system_ss.add(files( | ||
424 | 'migration-hmp-cmds.c', | ||
425 | 'migration.c', | ||
426 | 'multifd.c', | ||
427 | + 'multifd-device-state.c', | ||
428 | 'multifd-nocomp.c', | ||
429 | 'multifd-zlib.c', | ||
430 | 'multifd-zero-page.c', | ||
431 | -- | ||
432 | 2.48.1 | ||
433 | |||
434 | diff view generated by jsdifflib |
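
A hedged sketch of how a device's save path might call the new helper: everything except multifd_queue_device_state() and its declaration in migration/misc.h is hypothetical. As the diff shows, the idstr and buffer are duplicated internally (g_strdup()/g_memdup2()), so the caller keeps ownership of its own buffer.

    #include "qemu/osdep.h"
    #include "migration/misc.h"   /* multifd_queue_device_state() */

    /* Hypothetical wrapper used by a device's save code. */
    static bool my_device_send_chunk(char *idstr, uint32_t instance_id,
                                     char *chunk, size_t chunk_len)
    {
        /* Returns false if multifd_send() failed, e.g. migration is aborting. */
        if (!multifd_queue_device_state(idstr, instance_id, chunk, chunk_len)) {
            return false;
        }
        /* 'chunk' was copied internally and can be reused or freed now. */
        return true;
    }
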
Deleted patch | |||
---|---|---|---|
1 | From: Peter Xu <peterx@redhat.com> | ||
2 | 1 | ||
3 | The newly introduced device state buffer can be used not only for | ||
4 | storing VFIO's raw read() data, but also for generic device states. | ||
5 | Since device states may not easily provide a maximum buffer size (and | ||
6 | RAM's MultiFDPages_t also wants flexibility in managing its offset[] | ||
7 | array), it may not be a good idea to stick with a union for | ||
8 | MultiFDSendData, as it won't play well with such | ||
9 | flexibility. | ||
10 | |||
11 | Switch MultiFDSendData to a struct. | ||
12 | |||
13 | It won't consume much more space in reality: the real buffers were | ||
14 | already dynamically allocated, so only the two small structs | ||
15 | (pages, device_state) end up being duplicated. | ||
16 | |||
17 | With this, we can remove the hard-to-understand allocation size logic. | ||
18 | Because now we can allocate offset[] together with the SendData, and | ||
19 | properly free it when the SendData is freed. | ||
20 | |||
21 | [MSS: Make sure to clear possible device state payload before freeing | ||
22 | MultiFDSendData, remove placeholders for other patches not included] | ||
23 | |||
24 | Signed-off-by: Peter Xu <peterx@redhat.com> | ||
25 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
26 | Acked-by: Fabiano Rosas <farosas@suse.de> | ||
27 | Link: https://lore.kernel.org/qemu-devel/7b02baba8e6ddb23ef7c349d312b9b631db09d7e.1741124640.git.maciej.szmigiero@oracle.com | ||
28 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
29 | --- | ||
30 | migration/multifd.h | 15 +++++++++------ | ||
31 | migration/multifd-device-state.c | 5 ----- | ||
32 | migration/multifd-nocomp.c | 13 ++++++------- | ||
33 | migration/multifd.c | 25 +++++++------------------ | ||
34 | 4 files changed, 22 insertions(+), 36 deletions(-) | ||
35 | |||
36 | diff --git a/migration/multifd.h b/migration/multifd.h | ||
37 | index XXXXXXX..XXXXXXX 100644 | ||
38 | --- a/migration/multifd.h | ||
39 | +++ b/migration/multifd.h | ||
40 | @@ -XXX,XX +XXX,XX @@ typedef struct { | ||
41 | uint32_t num; | ||
42 | /* number of normal pages */ | ||
43 | uint32_t normal_num; | ||
44 | + /* | ||
45 | + * Pointer to the ramblock. NOTE: it's the caller's responsibility to | ||
46 | + * make sure the pointer is always valid! | ||
47 | + */ | ||
48 | RAMBlock *block; | ||
49 | - /* offset of each page */ | ||
50 | - ram_addr_t offset[]; | ||
51 | + /* offset array of each page, managed by multifd */ | ||
52 | + ram_addr_t *offset; | ||
53 | } MultiFDPages_t; | ||
54 | |||
55 | struct MultiFDRecvData { | ||
56 | @@ -XXX,XX +XXX,XX @@ typedef enum { | ||
57 | MULTIFD_PAYLOAD_DEVICE_STATE, | ||
58 | } MultiFDPayloadType; | ||
59 | |||
60 | -typedef union MultiFDPayload { | ||
61 | +typedef struct MultiFDPayload { | ||
62 | MultiFDPages_t ram; | ||
63 | MultiFDDeviceState_t device_state; | ||
64 | } MultiFDPayload; | ||
65 | @@ -XXX,XX +XXX,XX @@ void multifd_ram_save_cleanup(void); | ||
66 | int multifd_ram_flush_and_sync(QEMUFile *f); | ||
67 | bool multifd_ram_sync_per_round(void); | ||
68 | bool multifd_ram_sync_per_section(void); | ||
69 | -size_t multifd_ram_payload_size(void); | ||
70 | +void multifd_ram_payload_alloc(MultiFDPages_t *pages); | ||
71 | +void multifd_ram_payload_free(MultiFDPages_t *pages); | ||
72 | void multifd_ram_fill_packet(MultiFDSendParams *p); | ||
73 | int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp); | ||
74 | |||
75 | -size_t multifd_device_state_payload_size(void); | ||
76 | - | ||
77 | void multifd_send_data_clear_device_state(MultiFDDeviceState_t *device_state); | ||
78 | |||
79 | void multifd_device_state_send_setup(void); | ||
80 | diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c | ||
81 | index XXXXXXX..XXXXXXX 100644 | ||
82 | --- a/migration/multifd-device-state.c | ||
83 | +++ b/migration/multifd-device-state.c | ||
84 | @@ -XXX,XX +XXX,XX @@ static struct { | ||
85 | MultiFDSendData *send_data; | ||
86 | } *multifd_send_device_state; | ||
87 | |||
88 | -size_t multifd_device_state_payload_size(void) | ||
89 | -{ | ||
90 | - return sizeof(MultiFDDeviceState_t); | ||
91 | -} | ||
92 | - | ||
93 | void multifd_device_state_send_setup(void) | ||
94 | { | ||
95 | assert(!multifd_send_device_state); | ||
96 | diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c | ||
97 | index XXXXXXX..XXXXXXX 100644 | ||
98 | --- a/migration/multifd-nocomp.c | ||
99 | +++ b/migration/multifd-nocomp.c | ||
100 | @@ -XXX,XX +XXX,XX @@ | ||
101 | |||
102 | static MultiFDSendData *multifd_ram_send; | ||
103 | |||
104 | -size_t multifd_ram_payload_size(void) | ||
105 | +void multifd_ram_payload_alloc(MultiFDPages_t *pages) | ||
106 | { | ||
107 | - uint32_t n = multifd_ram_page_count(); | ||
108 | + pages->offset = g_new0(ram_addr_t, multifd_ram_page_count()); | ||
109 | +} | ||
110 | |||
111 | - /* | ||
112 | - * We keep an array of page offsets at the end of MultiFDPages_t, | ||
113 | - * add space for it in the allocation. | ||
114 | - */ | ||
115 | - return sizeof(MultiFDPages_t) + n * sizeof(ram_addr_t); | ||
116 | +void multifd_ram_payload_free(MultiFDPages_t *pages) | ||
117 | +{ | ||
118 | + g_clear_pointer(&pages->offset, g_free); | ||
119 | } | ||
120 | |||
121 | void multifd_ram_save_setup(void) | ||
122 | diff --git a/migration/multifd.c b/migration/multifd.c | ||
123 | index XXXXXXX..XXXXXXX 100644 | ||
124 | --- a/migration/multifd.c | ||
125 | +++ b/migration/multifd.c | ||
126 | @@ -XXX,XX +XXX,XX @@ struct { | ||
127 | |||
128 | MultiFDSendData *multifd_send_data_alloc(void) | ||
129 | { | ||
130 | - size_t max_payload_size, size_minus_payload; | ||
131 | + MultiFDSendData *new = g_new0(MultiFDSendData, 1); | ||
132 | |||
133 | - /* | ||
134 | - * MultiFDPages_t has a flexible array at the end, account for it | ||
135 | - * when allocating MultiFDSendData. Use max() in case other types | ||
136 | - * added to the union in the future are larger than | ||
137 | - * (MultiFDPages_t + flex array). | ||
138 | - */ | ||
139 | - max_payload_size = MAX(multifd_ram_payload_size(), | ||
140 | - multifd_device_state_payload_size()); | ||
141 | - max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); | ||
142 | - | ||
143 | - /* | ||
144 | - * Account for any holes the compiler might insert. We can't pack | ||
145 | - * the structure because that misaligns the members and triggers | ||
146 | - * Waddress-of-packed-member. | ||
147 | - */ | ||
148 | - size_minus_payload = sizeof(MultiFDSendData) - sizeof(MultiFDPayload); | ||
149 | + multifd_ram_payload_alloc(&new->u.ram); | ||
150 | + /* Device state allocates its payload on-demand */ | ||
151 | |||
152 | - return g_malloc0(size_minus_payload + max_payload_size); | ||
153 | + return new; | ||
154 | } | ||
155 | |||
156 | void multifd_send_data_clear(MultiFDSendData *data) | ||
157 | @@ -XXX,XX +XXX,XX @@ void multifd_send_data_free(MultiFDSendData *data) | ||
158 | return; | ||
159 | } | ||
160 | |||
161 | + /* This also frees the device state payload */ | ||
162 | multifd_send_data_clear(data); | ||
163 | |||
164 | + multifd_ram_payload_free(&data->u.ram); | ||
165 | + | ||
166 | g_free(data); | ||
167 | } | ||
168 | |||
169 | -- | ||
170 | 2.48.1 | ||
171 | |||
172 | diff view generated by jsdifflib |
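
Condensing the net effect of the patch on the allocation scheme into a sketch (types abbreviated; the diff above is authoritative, and this is only how this reads to the editor):

    /*
     * Before: MultiFDPayload was a union and MultiFDPages_t ended in a
     * flexible array member, so multifd_send_data_alloc() had to compute
     *   sizeof(MultiFDSendData) - sizeof(MultiFDPayload) + max(payload sizes)
     * by hand.
     *
     * After: MultiFDPayload is a struct and pages->offset is an ordinary
     * pointer, so allocation reduces to
     *   MultiFDSendData *new = g_new0(MultiFDSendData, 1);
     *   multifd_ram_payload_alloc(&new->u.ram);   // g_new0() of offset[]
     * and multifd_ram_payload_free() releases it from multifd_send_data_free().
     */
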
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Since device state transfer via multifd channels requires multifd | ||
4 | channels with packets and is currently not compatible with multifd | ||
5 | compression, add an appropriate query function so a device can learn | ||
6 | whether it can actually make use of it. | ||
7 | |||
8 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
9 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
11 | Link: https://lore.kernel.org/qemu-devel/1ff0d98b85f470e5a33687406e877583b8fab74e.1741124640.git.maciej.szmigiero@oracle.com | ||
12 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
13 | --- | ||
14 | include/migration/misc.h | 1 + | ||
15 | migration/multifd-device-state.c | 7 +++++++ | ||
16 | 2 files changed, 8 insertions(+) | ||
17 | |||
18 | diff --git a/include/migration/misc.h b/include/migration/misc.h | ||
19 | index XXXXXXX..XXXXXXX 100644 | ||
20 | --- a/include/migration/misc.h | ||
21 | +++ b/include/migration/misc.h | ||
22 | @@ -XXX,XX +XXX,XX @@ bool migrate_uri_parse(const char *uri, MigrationChannel **channel, | ||
23 | /* migration/multifd-device-state.c */ | ||
24 | bool multifd_queue_device_state(char *idstr, uint32_t instance_id, | ||
25 | char *data, size_t len); | ||
26 | +bool multifd_device_state_supported(void); | ||
27 | |||
28 | #endif | ||
29 | diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c | ||
30 | index XXXXXXX..XXXXXXX 100644 | ||
31 | --- a/migration/multifd-device-state.c | ||
32 | +++ b/migration/multifd-device-state.c | ||
33 | @@ -XXX,XX +XXX,XX @@ | ||
34 | #include "qemu/lockable.h" | ||
35 | #include "migration/misc.h" | ||
36 | #include "multifd.h" | ||
37 | +#include "options.h" | ||
38 | |||
39 | static struct { | ||
40 | QemuMutex queue_job_mutex; | ||
41 | @@ -XXX,XX +XXX,XX @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, | ||
42 | |||
43 | return true; | ||
44 | } | ||
45 | + | ||
46 | +bool multifd_device_state_supported(void) | ||
47 | +{ | ||
48 | + return migrate_multifd() && !migrate_mapped_ram() && | ||
49 | + migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; | ||
50 | +} | ||
51 | -- | ||
52 | 2.48.1 | ||
53 | |||
54 | diff view generated by jsdifflib |
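
A minimal sketch of the intended call site: only multifd_device_state_supported() comes from the patch, and the surrounding decision logic is a hypothetical example of how a device could use it.

    static void my_device_plan_state_transfer(void)
    {
        if (multifd_device_state_supported()) {
            /* queue state buffers via multifd_queue_device_state() */
        } else {
            /* fall back to streaming state over the main migration channel */
        }
    }
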
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | This SaveVMHandler helps a device provide its own asynchronous transmission | ||
4 | of the remaining data at the end of a precopy phase via multifd channels, | ||
5 | in parallel with the transfer done by save_live_complete_precopy handlers. | ||
6 | |||
7 | These threads are launched only when multifd device state transfer is | ||
8 | supported. | ||
9 | |||
10 | Management of these threads is done in the multifd migration code, | ||
11 | wrapping them in the generic thread pool. | ||
12 | |||
13 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
14 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
15 | Link: https://lore.kernel.org/qemu-devel/eac74a4ca7edd8968bbf72aa07b9041c76364a16.1741124640.git.maciej.szmigiero@oracle.com | ||
16 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
17 | --- | ||
18 | include/migration/misc.h | 17 ++++++ | ||
19 | include/migration/register.h | 19 +++++++ | ||
20 | include/qemu/typedefs.h | 3 ++ | ||
21 | migration/multifd-device-state.c | 92 ++++++++++++++++++++++++++++++++ | ||
22 | migration/savevm.c | 40 +++++++++++++- | ||
23 | 5 files changed, 170 insertions(+), 1 deletion(-) | ||
24 | |||
25 | diff --git a/include/migration/misc.h b/include/migration/misc.h | ||
26 | index XXXXXXX..XXXXXXX 100644 | ||
27 | --- a/include/migration/misc.h | ||
28 | +++ b/include/migration/misc.h | ||
29 | @@ -XXX,XX +XXX,XX @@ bool migrate_uri_parse(const char *uri, MigrationChannel **channel, | ||
30 | Error **errp); | ||
31 | |||
32 | /* migration/multifd-device-state.c */ | ||
33 | +typedef struct SaveLiveCompletePrecopyThreadData { | ||
34 | + SaveLiveCompletePrecopyThreadHandler hdlr; | ||
35 | + char *idstr; | ||
36 | + uint32_t instance_id; | ||
37 | + void *handler_opaque; | ||
38 | +} SaveLiveCompletePrecopyThreadData; | ||
39 | + | ||
40 | bool multifd_queue_device_state(char *idstr, uint32_t instance_id, | ||
41 | char *data, size_t len); | ||
42 | bool multifd_device_state_supported(void); | ||
43 | |||
44 | +void | ||
45 | +multifd_spawn_device_state_save_thread(SaveLiveCompletePrecopyThreadHandler hdlr, | ||
46 | + char *idstr, uint32_t instance_id, | ||
47 | + void *opaque); | ||
48 | + | ||
49 | +bool multifd_device_state_save_thread_should_exit(void); | ||
50 | + | ||
51 | +void multifd_abort_device_state_save_threads(void); | ||
52 | +bool multifd_join_device_state_save_threads(void); | ||
53 | + | ||
54 | #endif | ||
55 | diff --git a/include/migration/register.h b/include/migration/register.h | ||
56 | index XXXXXXX..XXXXXXX 100644 | ||
57 | --- a/include/migration/register.h | ||
58 | +++ b/include/migration/register.h | ||
59 | @@ -XXX,XX +XXX,XX @@ typedef struct SaveVMHandlers { | ||
60 | */ | ||
61 | int (*save_live_complete_precopy)(QEMUFile *f, void *opaque); | ||
62 | |||
63 | + /** | ||
64 | + * @save_live_complete_precopy_thread (invoked in a separate thread) | ||
65 | + * | ||
66 | + * Called at the end of a precopy phase from a separate worker thread | ||
67 | + * in configurations where multifd device state transfer is supported | ||
68 | + * in order to perform asynchronous transmission of the remaining data in | ||
69 | + * parallel with @save_live_complete_precopy handlers. | ||
70 | + * When postcopy is enabled, devices that support postcopy will skip this | ||
71 | + * step. | ||
72 | + * | ||
73 | + * @d: a #SaveLiveCompletePrecopyThreadData containing parameters that the | ||
74 | + * handler may need, including this device section idstr and instance_id, | ||
75 | + * and opaque data pointer passed to register_savevm_live(). | ||
76 | + * @errp: pointer to Error*, to store an error if it happens. | ||
77 | + * | ||
78 | + * Returns true to indicate success and false for errors. | ||
79 | + */ | ||
80 | + SaveLiveCompletePrecopyThreadHandler save_live_complete_precopy_thread; | ||
81 | + | ||
82 | /* This runs both outside and inside the BQL. */ | ||
83 | |||
84 | /** | ||
85 | diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h | ||
86 | index XXXXXXX..XXXXXXX 100644 | ||
87 | --- a/include/qemu/typedefs.h | ||
88 | +++ b/include/qemu/typedefs.h | ||
89 | @@ -XXX,XX +XXX,XX @@ typedef struct QString QString; | ||
90 | typedef struct RAMBlock RAMBlock; | ||
91 | typedef struct Range Range; | ||
92 | typedef struct ReservedRegion ReservedRegion; | ||
93 | +typedef struct SaveLiveCompletePrecopyThreadData SaveLiveCompletePrecopyThreadData; | ||
94 | typedef struct SHPCDevice SHPCDevice; | ||
95 | typedef struct SSIBus SSIBus; | ||
96 | typedef struct TCGCPUOps TCGCPUOps; | ||
97 | @@ -XXX,XX +XXX,XX @@ typedef struct IRQState *qemu_irq; | ||
98 | typedef void (*qemu_irq_handler)(void *opaque, int n, int level); | ||
99 | typedef bool (*MigrationLoadThread)(void *opaque, bool *should_quit, | ||
100 | Error **errp); | ||
101 | +typedef bool (*SaveLiveCompletePrecopyThreadHandler)(SaveLiveCompletePrecopyThreadData *d, | ||
102 | + Error **errp); | ||
103 | |||
104 | #endif /* QEMU_TYPEDEFS_H */ | ||
105 | diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c | ||
106 | index XXXXXXX..XXXXXXX 100644 | ||
107 | --- a/migration/multifd-device-state.c | ||
108 | +++ b/migration/multifd-device-state.c | ||
109 | @@ -XXX,XX +XXX,XX @@ | ||
110 | */ | ||
111 | |||
112 | #include "qemu/osdep.h" | ||
113 | +#include "qapi/error.h" | ||
114 | #include "qemu/lockable.h" | ||
115 | +#include "block/thread-pool.h" | ||
116 | +#include "migration.h" | ||
117 | #include "migration/misc.h" | ||
118 | #include "multifd.h" | ||
119 | #include "options.h" | ||
120 | @@ -XXX,XX +XXX,XX @@ static struct { | ||
121 | QemuMutex queue_job_mutex; | ||
122 | |||
123 | MultiFDSendData *send_data; | ||
124 | + | ||
125 | + ThreadPool *threads; | ||
126 | + bool threads_abort; | ||
127 | } *multifd_send_device_state; | ||
128 | |||
129 | void multifd_device_state_send_setup(void) | ||
130 | @@ -XXX,XX +XXX,XX @@ void multifd_device_state_send_setup(void) | ||
131 | qemu_mutex_init(&multifd_send_device_state->queue_job_mutex); | ||
132 | |||
133 | multifd_send_device_state->send_data = multifd_send_data_alloc(); | ||
134 | + | ||
135 | + multifd_send_device_state->threads = thread_pool_new(); | ||
136 | + multifd_send_device_state->threads_abort = false; | ||
137 | } | ||
138 | |||
139 | void multifd_device_state_send_cleanup(void) | ||
140 | { | ||
141 | + g_clear_pointer(&multifd_send_device_state->threads, thread_pool_free); | ||
142 | g_clear_pointer(&multifd_send_device_state->send_data, | ||
143 | multifd_send_data_free); | ||
144 | |||
145 | @@ -XXX,XX +XXX,XX @@ bool multifd_device_state_supported(void) | ||
146 | return migrate_multifd() && !migrate_mapped_ram() && | ||
147 | migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; | ||
148 | } | ||
149 | + | ||
150 | +static void multifd_device_state_save_thread_data_free(void *opaque) | ||
151 | +{ | ||
152 | + SaveLiveCompletePrecopyThreadData *data = opaque; | ||
153 | + | ||
154 | + g_clear_pointer(&data->idstr, g_free); | ||
155 | + g_free(data); | ||
156 | +} | ||
157 | + | ||
158 | +static int multifd_device_state_save_thread(void *opaque) | ||
159 | +{ | ||
160 | + SaveLiveCompletePrecopyThreadData *data = opaque; | ||
161 | + g_autoptr(Error) local_err = NULL; | ||
162 | + | ||
163 | + if (!data->hdlr(data, &local_err)) { | ||
164 | + MigrationState *s = migrate_get_current(); | ||
165 | + | ||
166 | + /* | ||
167 | + * Can't call abort_device_state_save_threads() here since new | ||
168 | + * save threads could still be in process of being launched | ||
169 | + * (if, for example, the very first save thread launched exited | ||
170 | + * with an error very quickly). | ||
171 | + */ | ||
172 | + | ||
173 | + assert(local_err); | ||
174 | + | ||
175 | + /* | ||
176 | + * In case of multiple save threads failing, which thread's error | ||
177 | + * we end up setting is purely arbitrary. | ||
178 | + */ | ||
179 | + migrate_set_error(s, local_err); | ||
180 | + } | ||
181 | + | ||
182 | + return 0; | ||
183 | +} | ||
184 | + | ||
185 | +bool multifd_device_state_save_thread_should_exit(void) | ||
186 | +{ | ||
187 | + return qatomic_read(&multifd_send_device_state->threads_abort); | ||
188 | +} | ||
189 | + | ||
190 | +void | ||
191 | +multifd_spawn_device_state_save_thread(SaveLiveCompletePrecopyThreadHandler hdlr, | ||
192 | + char *idstr, uint32_t instance_id, | ||
193 | + void *opaque) | ||
194 | +{ | ||
195 | + SaveLiveCompletePrecopyThreadData *data; | ||
196 | + | ||
197 | + assert(multifd_device_state_supported()); | ||
198 | + assert(multifd_send_device_state); | ||
199 | + | ||
200 | + assert(!qatomic_read(&multifd_send_device_state->threads_abort)); | ||
201 | + | ||
202 | + data = g_new(SaveLiveCompletePrecopyThreadData, 1); | ||
203 | + data->hdlr = hdlr; | ||
204 | + data->idstr = g_strdup(idstr); | ||
205 | + data->instance_id = instance_id; | ||
206 | + data->handler_opaque = opaque; | ||
207 | + | ||
208 | + thread_pool_submit_immediate(multifd_send_device_state->threads, | ||
209 | + multifd_device_state_save_thread, | ||
210 | + data, | ||
211 | + multifd_device_state_save_thread_data_free); | ||
212 | +} | ||
213 | + | ||
214 | +void multifd_abort_device_state_save_threads(void) | ||
215 | +{ | ||
216 | + assert(multifd_device_state_supported()); | ||
217 | + | ||
218 | + qatomic_set(&multifd_send_device_state->threads_abort, true); | ||
219 | +} | ||
220 | + | ||
221 | +bool multifd_join_device_state_save_threads(void) | ||
222 | +{ | ||
223 | + MigrationState *s = migrate_get_current(); | ||
224 | + | ||
225 | + assert(multifd_device_state_supported()); | ||
226 | + | ||
227 | + thread_pool_wait(multifd_send_device_state->threads); | ||
228 | + | ||
229 | + return !migrate_has_error(s); | ||
230 | +} | ||
231 | diff --git a/migration/savevm.c b/migration/savevm.c | ||
232 | index XXXXXXX..XXXXXXX 100644 | ||
233 | --- a/migration/savevm.c | ||
234 | +++ b/migration/savevm.c | ||
235 | @@ -XXX,XX +XXX,XX @@ | ||
236 | #include "migration/register.h" | ||
237 | #include "migration/global_state.h" | ||
238 | #include "migration/channel-block.h" | ||
239 | +#include "multifd.h" | ||
240 | #include "ram.h" | ||
241 | #include "qemu-file.h" | ||
242 | #include "savevm.h" | ||
243 | @@ -XXX,XX +XXX,XX @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) | ||
244 | int64_t start_ts_each, end_ts_each; | ||
245 | SaveStateEntry *se; | ||
246 | int ret; | ||
247 | + bool multifd_device_state = multifd_device_state_supported(); | ||
248 | + | ||
249 | + if (multifd_device_state) { | ||
250 | + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { | ||
251 | + SaveLiveCompletePrecopyThreadHandler hdlr; | ||
252 | + | ||
253 | + if (!se->ops || (in_postcopy && se->ops->has_postcopy && | ||
254 | + se->ops->has_postcopy(se->opaque)) || | ||
255 | + !se->ops->save_live_complete_precopy_thread) { | ||
256 | + continue; | ||
257 | + } | ||
258 | + | ||
259 | + hdlr = se->ops->save_live_complete_precopy_thread; | ||
260 | + multifd_spawn_device_state_save_thread(hdlr, | ||
261 | + se->idstr, se->instance_id, | ||
262 | + se->opaque); | ||
263 | + } | ||
264 | + } | ||
265 | |||
266 | QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { | ||
267 | if (!se->ops || | ||
268 | @@ -XXX,XX +XXX,XX @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) | ||
269 | save_section_footer(f, se); | ||
270 | if (ret < 0) { | ||
271 | qemu_file_set_error(f, ret); | ||
272 | - return -1; | ||
273 | + goto ret_fail_abort_threads; | ||
274 | } | ||
275 | end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); | ||
276 | trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, | ||
277 | end_ts_each - start_ts_each); | ||
278 | } | ||
279 | |||
280 | + if (multifd_device_state) { | ||
281 | + if (migrate_has_error(migrate_get_current())) { | ||
282 | + multifd_abort_device_state_save_threads(); | ||
283 | + } | ||
284 | + | ||
285 | + if (!multifd_join_device_state_save_threads()) { | ||
286 | + qemu_file_set_error(f, -EINVAL); | ||
287 | + return -1; | ||
288 | + } | ||
289 | + } | ||
290 | + | ||
291 | trace_vmstate_downtime_checkpoint("src-iterable-saved"); | ||
292 | |||
293 | return 0; | ||
294 | + | ||
295 | +ret_fail_abort_threads: | ||
296 | + if (multifd_device_state) { | ||
297 | + multifd_abort_device_state_save_threads(); | ||
298 | + multifd_join_device_state_save_threads(); | ||
299 | + } | ||
300 | + | ||
301 | + return -1; | ||
302 | } | ||
303 | |||
304 | int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, | ||
305 | -- | ||
306 | 2.48.1 | ||
307 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | And rename existing load_device_config_state trace event to | ||
4 | load_device_config_state_end for consistency since it is triggered at the | ||
5 | end of loading of the VFIO device config state. | ||
6 | |||
7 | This way both the start and end points of particular device config | ||
8 | loading operation (a long, BQL-serialized operation) are known. | ||
9 | |||
10 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
11 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
12 | Link: https://lore.kernel.org/qemu-devel/1b6c5a2097e64c272eb7e53f9e4cca4b79581b38.1741124640.git.maciej.szmigiero@oracle.com | ||
13 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
14 | --- | ||
15 | hw/vfio/migration.c | 4 +++- | ||
16 | hw/vfio/trace-events | 3 ++- | ||
17 | 2 files changed, 5 insertions(+), 2 deletions(-) | ||
18 | |||
19 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
20 | index XXXXXXX..XXXXXXX 100644 | ||
21 | --- a/hw/vfio/migration.c | ||
22 | +++ b/hw/vfio/migration.c | ||
23 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) | ||
24 | VFIODevice *vbasedev = opaque; | ||
25 | uint64_t data; | ||
26 | |||
27 | + trace_vfio_load_device_config_state_start(vbasedev->name); | ||
28 | + | ||
29 | if (vbasedev->ops && vbasedev->ops->vfio_load_config) { | ||
30 | int ret; | ||
31 | |||
32 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) | ||
33 | return -EINVAL; | ||
34 | } | ||
35 | |||
36 | - trace_vfio_load_device_config_state(vbasedev->name); | ||
37 | + trace_vfio_load_device_config_state_end(vbasedev->name); | ||
38 | return qemu_file_get_error(f); | ||
39 | } | ||
40 | |||
41 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | ||
42 | index XXXXXXX..XXXXXXX 100644 | ||
43 | --- a/hw/vfio/trace-events | ||
44 | +++ b/hw/vfio/trace-events | ||
45 | @@ -XXX,XX +XXX,XX @@ vfio_display_edid_write_error(void) "" | ||
46 | |||
47 | # migration.c | ||
48 | vfio_load_cleanup(const char *name) " (%s)" | ||
49 | -vfio_load_device_config_state(const char *name) " (%s)" | ||
50 | +vfio_load_device_config_state_start(const char *name) " (%s)" | ||
51 | +vfio_load_device_config_state_end(const char *name) " (%s)" | ||
52 | vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 | ||
53 | vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" | ||
54 | vfio_migration_realize(const char *name) " (%s)" | ||
55 | -- | ||
56 | 2.48.1 | ||
57 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | So it can be safely accessed from multiple threads. | ||
4 | |||
5 | This variable's type needs to be changed to unsigned long since | ||
6 | 32-bit host platforms lack the necessary atomic addition operations | ||
7 | on 64-bit variables. | ||
8 | |||
9 | Using 32-bit counters on 32-bit host platforms should not be a problem | ||
10 | in practice since they can't realistically address more memory anyway. | ||
11 | |||
12 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
13 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
14 | Link: https://lore.kernel.org/qemu-devel/dc391771d2d9ad0f311994f0cb9e666da564aeaf.1741124640.git.maciej.szmigiero@oracle.com | ||
15 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
16 | --- | ||
17 | hw/vfio/migration.c | 8 ++++---- | ||
18 | 1 file changed, 4 insertions(+), 4 deletions(-) | ||
19 | |||
20 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
21 | index XXXXXXX..XXXXXXX 100644 | ||
22 | --- a/hw/vfio/migration.c | ||
23 | +++ b/hw/vfio/migration.c | ||
24 | @@ -XXX,XX +XXX,XX @@ | ||
25 | */ | ||
26 | #define VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE (1 * MiB) | ||
27 | |||
28 | -static int64_t bytes_transferred; | ||
29 | +static unsigned long bytes_transferred; | ||
30 | |||
31 | static const char *mig_state_to_str(enum vfio_device_mig_state state) | ||
32 | { | ||
33 | @@ -XXX,XX +XXX,XX @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration) | ||
34 | qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); | ||
35 | qemu_put_be64(f, data_size); | ||
36 | qemu_put_buffer(f, migration->data_buffer, data_size); | ||
37 | - bytes_transferred += data_size; | ||
38 | + qatomic_add(&bytes_transferred, data_size); | ||
39 | |||
40 | trace_vfio_save_block(migration->vbasedev->name, data_size); | ||
41 | |||
42 | @@ -XXX,XX +XXX,XX @@ static int vfio_block_migration(VFIODevice *vbasedev, Error *err, Error **errp) | ||
43 | |||
44 | int64_t vfio_mig_bytes_transferred(void) | ||
45 | { | ||
46 | - return bytes_transferred; | ||
47 | + return MIN(qatomic_read(&bytes_transferred), INT64_MAX); | ||
48 | } | ||
49 | |||
50 | void vfio_reset_bytes_transferred(void) | ||
51 | { | ||
52 | - bytes_transferred = 0; | ||
53 | + qatomic_set(&bytes_transferred, 0); | ||
54 | } | ||
55 | |||
56 | /* | ||
57 | -- | ||
58 | 2.48.1 | ||
59 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | This way bytes_transferred can also be incremented in translation units | ||
4 | other than migration.c. | ||
5 | |||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
7 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
8 | Link: https://lore.kernel.org/qemu-devel/d1fbc27ac2417b49892f354ba20f6c6b3f7209f8.1741124640.git.maciej.szmigiero@oracle.com | ||
9 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
10 | --- | ||
11 | include/hw/vfio/vfio-common.h | 1 + | ||
12 | hw/vfio/migration.c | 7 ++++++- | ||
13 | 2 files changed, 7 insertions(+), 1 deletion(-) | ||
14 | |||
15 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | ||
16 | index XXXXXXX..XXXXXXX 100644 | ||
17 | --- a/include/hw/vfio/vfio-common.h | ||
18 | +++ b/include/hw/vfio/vfio-common.h | ||
19 | @@ -XXX,XX +XXX,XX @@ void vfio_unblock_multiple_devices_migration(void); | ||
20 | bool vfio_viommu_preset(VFIODevice *vbasedev); | ||
21 | int64_t vfio_mig_bytes_transferred(void); | ||
22 | void vfio_reset_bytes_transferred(void); | ||
23 | +void vfio_mig_add_bytes_transferred(unsigned long val); | ||
24 | bool vfio_device_state_is_running(VFIODevice *vbasedev); | ||
25 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); | ||
26 | |||
27 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
28 | index XXXXXXX..XXXXXXX 100644 | ||
29 | --- a/hw/vfio/migration.c | ||
30 | +++ b/hw/vfio/migration.c | ||
31 | @@ -XXX,XX +XXX,XX @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration) | ||
32 | qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); | ||
33 | qemu_put_be64(f, data_size); | ||
34 | qemu_put_buffer(f, migration->data_buffer, data_size); | ||
35 | - qatomic_add(&bytes_transferred, data_size); | ||
36 | + vfio_mig_add_bytes_transferred(data_size); | ||
37 | |||
38 | trace_vfio_save_block(migration->vbasedev->name, data_size); | ||
39 | |||
40 | @@ -XXX,XX +XXX,XX @@ void vfio_reset_bytes_transferred(void) | ||
41 | qatomic_set(&bytes_transferred, 0); | ||
42 | } | ||
43 | |||
44 | +void vfio_mig_add_bytes_transferred(unsigned long val) | ||
45 | +{ | ||
46 | + qatomic_add(&bytes_transferred, val); | ||
47 | +} | ||
48 | + | ||
49 | /* | ||
50 | * Return true when either migration initialized or blocker registered. | ||
51 | * Currently only return false when adding blocker fails which will | ||
52 | -- | ||
53 | 2.48.1 | ||
54 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | This way they can also be referenced in translation units | ||
4 | other than migration.c. | ||
5 | |||
6 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
8 | Link: https://lore.kernel.org/qemu-devel/26a940f6b22c1b685818251b7a3ddbbca601b1d6.1741124640.git.maciej.szmigiero@oracle.com | ||
9 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
10 | --- | ||
11 | include/hw/vfio/vfio-common.h | 17 +++++++++++++++++ | ||
12 | hw/vfio/migration.c | 17 ----------------- | ||
13 | 2 files changed, 17 insertions(+), 17 deletions(-) | ||
14 | |||
15 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | ||
16 | index XXXXXXX..XXXXXXX 100644 | ||
17 | --- a/include/hw/vfio/vfio-common.h | ||
18 | +++ b/include/hw/vfio/vfio-common.h | ||
19 | @@ -XXX,XX +XXX,XX @@ | ||
20 | |||
21 | #define VFIO_MSG_PREFIX "vfio %s: " | ||
22 | |||
23 | +/* | ||
24 | + * Flags to be used as unique delimiters for VFIO devices in the migration | ||
25 | + * stream. These flags are composed as: | ||
26 | + * 0xffffffff => MSB 32-bit all 1s | ||
27 | + * 0xef10 => Magic ID, represents emulated (virtual) function IO | ||
28 | + * 0x0000 => 16-bits reserved for flags | ||
29 | + * | ||
30 | + * The beginning of state information is marked by _DEV_CONFIG_STATE, | ||
31 | + * _DEV_SETUP_STATE, or _DEV_DATA_STATE, respectively. The end of a | ||
32 | + * certain state information is marked by _END_OF_STATE. | ||
33 | + */ | ||
34 | +#define VFIO_MIG_FLAG_END_OF_STATE (0xffffffffef100001ULL) | ||
35 | +#define VFIO_MIG_FLAG_DEV_CONFIG_STATE (0xffffffffef100002ULL) | ||
36 | +#define VFIO_MIG_FLAG_DEV_SETUP_STATE (0xffffffffef100003ULL) | ||
37 | +#define VFIO_MIG_FLAG_DEV_DATA_STATE (0xffffffffef100004ULL) | ||
38 | +#define VFIO_MIG_FLAG_DEV_INIT_DATA_SENT (0xffffffffef100005ULL) | ||
39 | + | ||
40 | enum { | ||
41 | VFIO_DEVICE_TYPE_PCI = 0, | ||
42 | VFIO_DEVICE_TYPE_PLATFORM = 1, | ||
43 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
44 | index XXXXXXX..XXXXXXX 100644 | ||
45 | --- a/hw/vfio/migration.c | ||
46 | +++ b/hw/vfio/migration.c | ||
47 | @@ -XXX,XX +XXX,XX @@ | ||
48 | #include "trace.h" | ||
49 | #include "hw/hw.h" | ||
50 | |||
51 | -/* | ||
52 | - * Flags to be used as unique delimiters for VFIO devices in the migration | ||
53 | - * stream. These flags are composed as: | ||
54 | - * 0xffffffff => MSB 32-bit all 1s | ||
55 | - * 0xef10 => Magic ID, represents emulated (virtual) function IO | ||
56 | - * 0x0000 => 16-bits reserved for flags | ||
57 | - * | ||
58 | - * The beginning of state information is marked by _DEV_CONFIG_STATE, | ||
59 | - * _DEV_SETUP_STATE, or _DEV_DATA_STATE, respectively. The end of a | ||
60 | - * certain state information is marked by _END_OF_STATE. | ||
61 | - */ | ||
62 | -#define VFIO_MIG_FLAG_END_OF_STATE (0xffffffffef100001ULL) | ||
63 | -#define VFIO_MIG_FLAG_DEV_CONFIG_STATE (0xffffffffef100002ULL) | ||
64 | -#define VFIO_MIG_FLAG_DEV_SETUP_STATE (0xffffffffef100003ULL) | ||
65 | -#define VFIO_MIG_FLAG_DEV_DATA_STATE (0xffffffffef100004ULL) | ||
66 | -#define VFIO_MIG_FLAG_DEV_INIT_DATA_SENT (0xffffffffef100005ULL) | ||
67 | - | ||
68 | /* | ||
69 | * This is an arbitrary size based on migration of mlx5 devices, where typically | ||
70 | * total device migration size is on the order of 100s of MB. Testing with | ||
71 | -- | ||
72 | 2.48.1 | ||
73 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Add basic types and flags used by VFIO multifd device state transfer | ||
4 | support. | ||
5 | |||
6 | Since we'll be introducing a lot of multifd-transfer-specific code, | ||
7 | add a new file, migration-multifd.c, to house it, wired into the main VFIO | ||
8 | migration code (migration.c) via the migration-multifd.h header file. | ||
9 | |||
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
11 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
12 | Link: https://lore.kernel.org/qemu-devel/4eedd529e6617f80f3d6a66d7268a0db2bc173fa.1741124640.git.maciej.szmigiero@oracle.com | ||
13 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
14 | --- | ||
15 | hw/vfio/migration-multifd.h | 17 +++++++++++++++++ | ||
16 | hw/vfio/migration-multifd.c | 33 +++++++++++++++++++++++++++++++++ | ||
17 | hw/vfio/migration.c | 1 + | ||
18 | hw/vfio/meson.build | 1 + | ||
19 | 4 files changed, 52 insertions(+) | ||
20 | create mode 100644 hw/vfio/migration-multifd.h | ||
21 | create mode 100644 hw/vfio/migration-multifd.c | ||
22 | |||
23 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | ||
24 | new file mode 100644 | ||
25 | index XXXXXXX..XXXXXXX | ||
26 | --- /dev/null | ||
27 | +++ b/hw/vfio/migration-multifd.h | ||
28 | @@ -XXX,XX +XXX,XX @@ | ||
29 | +/* | ||
30 | + * Multifd VFIO migration | ||
31 | + * | ||
32 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. | ||
33 | + * | ||
34 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. | ||
35 | + * See the COPYING file in the top-level directory. | ||
36 | + * | ||
37 | + * SPDX-License-Identifier: GPL-2.0-or-later | ||
38 | + */ | ||
39 | + | ||
40 | +#ifndef HW_VFIO_MIGRATION_MULTIFD_H | ||
41 | +#define HW_VFIO_MIGRATION_MULTIFD_H | ||
42 | + | ||
43 | +#include "hw/vfio/vfio-common.h" | ||
44 | + | ||
45 | +#endif | ||
46 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
47 | new file mode 100644 | ||
48 | index XXXXXXX..XXXXXXX | ||
49 | --- /dev/null | ||
50 | +++ b/hw/vfio/migration-multifd.c | ||
51 | @@ -XXX,XX +XXX,XX @@ | ||
52 | +/* | ||
53 | + * Multifd VFIO migration | ||
54 | + * | ||
55 | + * Copyright (C) 2024,2025 Oracle and/or its affiliates. | ||
56 | + * | ||
57 | + * This work is licensed under the terms of the GNU GPL, version 2 or later. | ||
58 | + * See the COPYING file in the top-level directory. | ||
59 | + * | ||
60 | + * SPDX-License-Identifier: GPL-2.0-or-later | ||
61 | + */ | ||
62 | + | ||
63 | +#include "qemu/osdep.h" | ||
64 | +#include "hw/vfio/vfio-common.h" | ||
65 | +#include "migration/misc.h" | ||
66 | +#include "qapi/error.h" | ||
67 | +#include "qemu/error-report.h" | ||
68 | +#include "qemu/lockable.h" | ||
69 | +#include "qemu/main-loop.h" | ||
70 | +#include "qemu/thread.h" | ||
71 | +#include "migration/qemu-file.h" | ||
72 | +#include "migration-multifd.h" | ||
73 | +#include "trace.h" | ||
74 | + | ||
75 | +#define VFIO_DEVICE_STATE_CONFIG_STATE (1) | ||
76 | + | ||
77 | +#define VFIO_DEVICE_STATE_PACKET_VER_CURRENT (0) | ||
78 | + | ||
79 | +typedef struct VFIODeviceStatePacket { | ||
80 | + uint32_t version; | ||
81 | + uint32_t idx; | ||
82 | + uint32_t flags; | ||
83 | + uint8_t data[0]; | ||
84 | +} QEMU_PACKED VFIODeviceStatePacket; | ||
85 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
86 | index XXXXXXX..XXXXXXX 100644 | ||
87 | --- a/hw/vfio/migration.c | ||
88 | +++ b/hw/vfio/migration.c | ||
89 | @@ -XXX,XX +XXX,XX @@ | ||
90 | #include "migration/qemu-file.h" | ||
91 | #include "migration/register.h" | ||
92 | #include "migration/blocker.h" | ||
93 | +#include "migration-multifd.h" | ||
94 | #include "qapi/error.h" | ||
95 | #include "qapi/qapi-events-vfio.h" | ||
96 | #include "exec/ramlist.h" | ||
97 | diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build | ||
98 | index XXXXXXX..XXXXXXX 100644 | ||
99 | --- a/hw/vfio/meson.build | ||
100 | +++ b/hw/vfio/meson.build | ||
101 | @@ -XXX,XX +XXX,XX @@ vfio_ss.add(files( | ||
102 | 'container-base.c', | ||
103 | 'container.c', | ||
104 | 'migration.c', | ||
105 | + 'migration-multifd.c', | ||
106 | 'cpr.c', | ||
107 | )) | ||
108 | vfio_ss.add(when: 'CONFIG_PSERIES', if_true: files('spapr.c')) | ||
109 | -- | ||
110 | 2.48.1 | ||
111 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Add a vfio_multifd_transfer_supported() function that tells whether | ||
4 | multifd device state transfer is supported. | ||
5 | |||
6 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
8 | Link: https://lore.kernel.org/qemu-devel/8ce50256f341b3d47342bb217cb5fbb2deb14639.1741124640.git.maciej.szmigiero@oracle.com | ||
9 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
10 | --- | ||
11 | hw/vfio/migration-multifd.h | 2 ++ | ||
12 | hw/vfio/migration-multifd.c | 6 ++++++ | ||
13 | 2 files changed, 8 insertions(+) | ||
14 | |||
15 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | ||
16 | index XXXXXXX..XXXXXXX 100644 | ||
17 | --- a/hw/vfio/migration-multifd.h | ||
18 | +++ b/hw/vfio/migration-multifd.h | ||
19 | @@ -XXX,XX +XXX,XX @@ | ||
20 | |||
21 | #include "hw/vfio/vfio-common.h" | ||
22 | |||
23 | +bool vfio_multifd_transfer_supported(void); | ||
24 | + | ||
25 | #endif | ||
26 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
27 | index XXXXXXX..XXXXXXX 100644 | ||
28 | --- a/hw/vfio/migration-multifd.c | ||
29 | +++ b/hw/vfio/migration-multifd.c | ||
30 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODeviceStatePacket { | ||
31 | uint32_t flags; | ||
32 | uint8_t data[0]; | ||
33 | } QEMU_PACKED VFIODeviceStatePacket; | ||
34 | + | ||
35 | +bool vfio_multifd_transfer_supported(void) | ||
36 | +{ | ||
37 | + return multifd_device_state_supported() && | ||
38 | + migrate_send_switchover_start(); | ||
39 | +} | ||
40 | -- | ||
41 | 2.48.1 | ||
42 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Add multifd setup/cleanup functions and an associated VFIOMultifd data | ||
4 | structure that will contain most of the receive-side data together | ||
5 | with its init/cleanup methods. | ||
6 | |||
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
8 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
9 | Link: https://lore.kernel.org/qemu-devel/c0520523053b1087787152ddf2163257d3030be0.1741124640.git.maciej.szmigiero@oracle.com | ||
10 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
11 | --- | ||
12 | hw/vfio/migration-multifd.h | 4 ++++ | ||
13 | include/hw/vfio/vfio-common.h | 3 +++ | ||
14 | hw/vfio/migration-multifd.c | 44 +++++++++++++++++++++++++++++++++++ | ||
15 | 3 files changed, 51 insertions(+) | ||
16 | |||
17 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | ||
18 | index XXXXXXX..XXXXXXX 100644 | ||
19 | --- a/hw/vfio/migration-multifd.h | ||
20 | +++ b/hw/vfio/migration-multifd.h | ||
21 | @@ -XXX,XX +XXX,XX @@ | ||
22 | |||
23 | #include "hw/vfio/vfio-common.h" | ||
24 | |||
25 | +bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp); | ||
26 | +void vfio_multifd_cleanup(VFIODevice *vbasedev); | ||
27 | + | ||
28 | bool vfio_multifd_transfer_supported(void); | ||
29 | +bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); | ||
30 | |||
31 | #endif | ||
32 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | ||
33 | index XXXXXXX..XXXXXXX 100644 | ||
34 | --- a/include/hw/vfio/vfio-common.h | ||
35 | +++ b/include/hw/vfio/vfio-common.h | ||
36 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIORegion { | ||
37 | uint8_t nr; /* cache the region number for debug */ | ||
38 | } VFIORegion; | ||
39 | |||
40 | +typedef struct VFIOMultifd VFIOMultifd; | ||
41 | + | ||
42 | typedef struct VFIOMigration { | ||
43 | struct VFIODevice *vbasedev; | ||
44 | VMChangeStateEntry *vm_state; | ||
45 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOMigration { | ||
46 | uint64_t mig_flags; | ||
47 | uint64_t precopy_init_size; | ||
48 | uint64_t precopy_dirty_size; | ||
49 | + VFIOMultifd *multifd; | ||
50 | bool initial_data_sent; | ||
51 | |||
52 | bool event_save_iterate_started; | ||
53 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
54 | index XXXXXXX..XXXXXXX 100644 | ||
55 | --- a/hw/vfio/migration-multifd.c | ||
56 | +++ b/hw/vfio/migration-multifd.c | ||
57 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODeviceStatePacket { | ||
58 | uint8_t data[0]; | ||
59 | } QEMU_PACKED VFIODeviceStatePacket; | ||
60 | |||
61 | +typedef struct VFIOMultifd { | ||
62 | +} VFIOMultifd; | ||
63 | + | ||
64 | +static VFIOMultifd *vfio_multifd_new(void) | ||
65 | +{ | ||
66 | + VFIOMultifd *multifd = g_new(VFIOMultifd, 1); | ||
67 | + | ||
68 | + return multifd; | ||
69 | +} | ||
70 | + | ||
71 | +static void vfio_multifd_free(VFIOMultifd *multifd) | ||
72 | +{ | ||
73 | + g_free(multifd); | ||
74 | +} | ||
75 | + | ||
76 | +void vfio_multifd_cleanup(VFIODevice *vbasedev) | ||
77 | +{ | ||
78 | + VFIOMigration *migration = vbasedev->migration; | ||
79 | + | ||
80 | + g_clear_pointer(&migration->multifd, vfio_multifd_free); | ||
81 | +} | ||
82 | + | ||
83 | bool vfio_multifd_transfer_supported(void) | ||
84 | { | ||
85 | return multifd_device_state_supported() && | ||
86 | migrate_send_switchover_start(); | ||
87 | } | ||
88 | + | ||
89 | +bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev) | ||
90 | +{ | ||
91 | + return false; | ||
92 | +} | ||
93 | + | ||
94 | +bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) | ||
95 | +{ | ||
96 | + VFIOMigration *migration = vbasedev->migration; | ||
97 | + | ||
98 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { | ||
99 | + /* Nothing further to check or do */ | ||
100 | + return true; | ||
101 | + } | ||
102 | + | ||
103 | + if (alloc_multifd) { | ||
104 | + assert(!migration->multifd); | ||
105 | + migration->multifd = vfio_multifd_new(); | ||
106 | + } | ||
107 | + | ||
108 | + return true; | ||
109 | +} | ||
110 | -- | ||
111 | 2.48.1 | ||
112 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Wire the VFIO multifd-transfer-specific setup and cleanup functions into | ||
4 | the general VFIO load/save setup and cleanup methods. | ||
5 | |||
6 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
7 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
8 | Link: https://lore.kernel.org/qemu-devel/b1f864a65fafd4fdab1f89230df52e46ae41f2ac.1741124640.git.maciej.szmigiero@oracle.com | ||
9 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
10 | --- | ||
11 | hw/vfio/migration.c | 24 ++++++++++++++++++++++-- | ||
12 | 1 file changed, 22 insertions(+), 2 deletions(-) | ||
13 | |||
14 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
15 | index XXXXXXX..XXXXXXX 100644 | ||
16 | --- a/hw/vfio/migration.c | ||
17 | +++ b/hw/vfio/migration.c | ||
18 | @@ -XXX,XX +XXX,XX @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp) | ||
19 | uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE; | ||
20 | int ret; | ||
21 | |||
22 | + if (!vfio_multifd_setup(vbasedev, false, errp)) { | ||
23 | + return -EINVAL; | ||
24 | + } | ||
25 | + | ||
26 | qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE); | ||
27 | |||
28 | vfio_query_stop_copy_size(vbasedev, &stop_copy_size); | ||
29 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_cleanup(void *opaque) | ||
30 | Error *local_err = NULL; | ||
31 | int ret; | ||
32 | |||
33 | + /* Currently a NOP, done for symmetry with load_cleanup() */ | ||
34 | + vfio_multifd_cleanup(vbasedev); | ||
35 | + | ||
36 | /* | ||
37 | * Changing device state from STOP_COPY to STOP can take time. Do it here, | ||
38 | * after migration has completed, so it won't increase downtime. | ||
39 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_state(QEMUFile *f, void *opaque) | ||
40 | static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) | ||
41 | { | ||
42 | VFIODevice *vbasedev = opaque; | ||
43 | + VFIOMigration *migration = vbasedev->migration; | ||
44 | + int ret; | ||
45 | |||
46 | - return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, | ||
47 | - vbasedev->migration->device_state, errp); | ||
48 | + if (!vfio_multifd_setup(vbasedev, true, errp)) { | ||
49 | + return -EINVAL; | ||
50 | + } | ||
51 | + | ||
52 | + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, | ||
53 | + migration->device_state, errp); | ||
54 | + if (ret) { | ||
55 | + return ret; | ||
56 | + } | ||
57 | + | ||
58 | + return 0; | ||
59 | } | ||
60 | |||
61 | static int vfio_load_cleanup(void *opaque) | ||
62 | { | ||
63 | VFIODevice *vbasedev = opaque; | ||
64 | |||
65 | + vfio_multifd_cleanup(vbasedev); | ||
66 | + | ||
67 | vfio_migration_cleanup(vbasedev); | ||
68 | trace_vfio_load_cleanup(vbasedev->name); | ||
69 | |||
70 | -- | ||
71 | 2.48.1 | ||
72 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | The data received via multifd needs to be reassembled since device state | ||
4 | packets sent via different multifd channels can arrive out of order. | ||
5 | |||
6 | Therefore, each VFIO device state packet carries a header indicating its | ||
7 | position in the stream. | ||
8 | The raw device state data is saved into a VFIOStateBuffer for later | ||
9 | in-order loading into the device. | ||
10 | |||
11 | The last such VFIO device state packet should have the | ||
12 | VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state. | ||
13 | |||
14 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
15 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
16 | Link: https://lore.kernel.org/qemu-devel/e3bff515a8d61c582b94b409eb12a45b1a143a69.1741124640.git.maciej.szmigiero@oracle.com | ||
17 | [ clg: - Reordered savevm_vfio_handlers | ||
18 | - Added load_state_buffer documentation ] | ||
19 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
20 | --- | ||
21 | docs/devel/migration/vfio.rst | 7 ++ | ||
22 | hw/vfio/migration-multifd.h | 3 + | ||
23 | hw/vfio/migration-multifd.c | 163 ++++++++++++++++++++++++++++++++++ | ||
24 | hw/vfio/migration.c | 4 + | ||
25 | hw/vfio/trace-events | 1 + | ||
26 | 5 files changed, 178 insertions(+) | ||
27 | |||
28 | diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst | ||
29 | index XXXXXXX..XXXXXXX 100644 | ||
30 | --- a/docs/devel/migration/vfio.rst | ||
31 | +++ b/docs/devel/migration/vfio.rst | ||
32 | @@ -XXX,XX +XXX,XX @@ VFIO implements the device hooks for the iterative approach as follows: | ||
33 | * A ``load_state`` function that loads the config section and the data | ||
34 | sections that are generated by the save functions above. | ||
35 | |||
36 | +* A ``load_state_buffer`` function that loads the device state and the device | ||
37 | + config that arrived via multifd channels. | ||
38 | + It's used only in the multifd mode. | ||
39 | + | ||
40 | * ``cleanup`` functions for both save and load that perform any migration | ||
41 | related cleanup. | ||
42 | |||
43 | @@ -XXX,XX +XXX,XX @@ Live migration resume path | ||
44 | (RESTORE_VM, _ACTIVE, _STOP) | ||
45 | | | ||
46 | For each device, .load_state() is called for that device section data | ||
47 | + transmitted via the main migration channel. | ||
48 | + For data transmitted via multifd channels .load_state_buffer() is called | ||
49 | + instead. | ||
50 | (RESTORE_VM, _ACTIVE, _RESUMING) | ||
51 | | | ||
52 | At the end, .load_cleanup() is called for each device and vCPUs are started | ||
53 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | ||
54 | index XXXXXXX..XXXXXXX 100644 | ||
55 | --- a/hw/vfio/migration-multifd.h | ||
56 | +++ b/hw/vfio/migration-multifd.h | ||
57 | @@ -XXX,XX +XXX,XX @@ void vfio_multifd_cleanup(VFIODevice *vbasedev); | ||
58 | bool vfio_multifd_transfer_supported(void); | ||
59 | bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); | ||
60 | |||
61 | +bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, | ||
62 | + Error **errp); | ||
63 | + | ||
64 | #endif | ||
65 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
66 | index XXXXXXX..XXXXXXX 100644 | ||
67 | --- a/hw/vfio/migration-multifd.c | ||
68 | +++ b/hw/vfio/migration-multifd.c | ||
69 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODeviceStatePacket { | ||
70 | uint8_t data[0]; | ||
71 | } QEMU_PACKED VFIODeviceStatePacket; | ||
72 | |||
73 | +/* type safety */ | ||
74 | +typedef struct VFIOStateBuffers { | ||
75 | + GArray *array; | ||
76 | +} VFIOStateBuffers; | ||
77 | + | ||
78 | +typedef struct VFIOStateBuffer { | ||
79 | + bool is_present; | ||
80 | + char *data; | ||
81 | + size_t len; | ||
82 | +} VFIOStateBuffer; | ||
83 | + | ||
84 | typedef struct VFIOMultifd { | ||
85 | + VFIOStateBuffers load_bufs; | ||
86 | + QemuCond load_bufs_buffer_ready_cond; | ||
87 | + QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ | ||
88 | + uint32_t load_buf_idx; | ||
89 | + uint32_t load_buf_idx_last; | ||
90 | } VFIOMultifd; | ||
91 | |||
92 | +static void vfio_state_buffer_clear(gpointer data) | ||
93 | +{ | ||
94 | + VFIOStateBuffer *lb = data; | ||
95 | + | ||
96 | + if (!lb->is_present) { | ||
97 | + return; | ||
98 | + } | ||
99 | + | ||
100 | + g_clear_pointer(&lb->data, g_free); | ||
101 | + lb->is_present = false; | ||
102 | +} | ||
103 | + | ||
104 | +static void vfio_state_buffers_init(VFIOStateBuffers *bufs) | ||
105 | +{ | ||
106 | + bufs->array = g_array_new(FALSE, TRUE, sizeof(VFIOStateBuffer)); | ||
107 | + g_array_set_clear_func(bufs->array, vfio_state_buffer_clear); | ||
108 | +} | ||
109 | + | ||
110 | +static void vfio_state_buffers_destroy(VFIOStateBuffers *bufs) | ||
111 | +{ | ||
112 | + g_clear_pointer(&bufs->array, g_array_unref); | ||
113 | +} | ||
114 | + | ||
115 | +static void vfio_state_buffers_assert_init(VFIOStateBuffers *bufs) | ||
116 | +{ | ||
117 | + assert(bufs->array); | ||
118 | +} | ||
119 | + | ||
120 | +static unsigned int vfio_state_buffers_size_get(VFIOStateBuffers *bufs) | ||
121 | +{ | ||
122 | + return bufs->array->len; | ||
123 | +} | ||
124 | + | ||
125 | +static void vfio_state_buffers_size_set(VFIOStateBuffers *bufs, | ||
126 | + unsigned int size) | ||
127 | +{ | ||
128 | + g_array_set_size(bufs->array, size); | ||
129 | +} | ||
130 | + | ||
131 | +static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, | ||
132 | + unsigned int idx) | ||
133 | +{ | ||
134 | + return &g_array_index(bufs->array, VFIOStateBuffer, idx); | ||
135 | +} | ||
136 | + | ||
137 | +/* called with load_bufs_mutex locked */ | ||
138 | +static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev, | ||
139 | + VFIODeviceStatePacket *packet, | ||
140 | + size_t packet_total_size, | ||
141 | + Error **errp) | ||
142 | +{ | ||
143 | + VFIOMigration *migration = vbasedev->migration; | ||
144 | + VFIOMultifd *multifd = migration->multifd; | ||
145 | + VFIOStateBuffer *lb; | ||
146 | + | ||
147 | + vfio_state_buffers_assert_init(&multifd->load_bufs); | ||
148 | + if (packet->idx >= vfio_state_buffers_size_get(&multifd->load_bufs)) { | ||
149 | + vfio_state_buffers_size_set(&multifd->load_bufs, packet->idx + 1); | ||
150 | + } | ||
151 | + | ||
152 | + lb = vfio_state_buffers_at(&multifd->load_bufs, packet->idx); | ||
153 | + if (lb->is_present) { | ||
154 | + error_setg(errp, "%s: state buffer %" PRIu32 " already filled", | ||
155 | + vbasedev->name, packet->idx); | ||
156 | + return false; | ||
157 | + } | ||
158 | + | ||
159 | + assert(packet->idx >= multifd->load_buf_idx); | ||
160 | + | ||
161 | + lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet)); | ||
162 | + lb->len = packet_total_size - sizeof(*packet); | ||
163 | + lb->is_present = true; | ||
164 | + | ||
165 | + return true; | ||
166 | +} | ||
167 | + | ||
168 | +bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, | ||
169 | + Error **errp) | ||
170 | +{ | ||
171 | + VFIODevice *vbasedev = opaque; | ||
172 | + VFIOMigration *migration = vbasedev->migration; | ||
173 | + VFIOMultifd *multifd = migration->multifd; | ||
174 | + VFIODeviceStatePacket *packet = (VFIODeviceStatePacket *)data; | ||
175 | + | ||
176 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { | ||
177 | + error_setg(errp, | ||
178 | + "%s: got device state packet but not doing multifd transfer", | ||
179 | + vbasedev->name); | ||
180 | + return false; | ||
181 | + } | ||
182 | + | ||
183 | + assert(multifd); | ||
184 | + | ||
185 | + if (data_size < sizeof(*packet)) { | ||
186 | + error_setg(errp, "%s: packet too short at %zu (min is %zu)", | ||
187 | + vbasedev->name, data_size, sizeof(*packet)); | ||
188 | + return false; | ||
189 | + } | ||
190 | + | ||
191 | + if (packet->version != VFIO_DEVICE_STATE_PACKET_VER_CURRENT) { | ||
192 | + error_setg(errp, "%s: packet has unknown version %" PRIu32, | ||
193 | + vbasedev->name, packet->version); | ||
194 | + return false; | ||
195 | + } | ||
196 | + | ||
197 | + if (packet->idx == UINT32_MAX) { | ||
198 | + error_setg(errp, "%s: packet index is invalid", vbasedev->name); | ||
199 | + return false; | ||
200 | + } | ||
201 | + | ||
202 | + trace_vfio_load_state_device_buffer_incoming(vbasedev->name, packet->idx); | ||
203 | + | ||
204 | + /* | ||
205 | + * Holding BQL here would violate the lock order and can cause | ||
206 | + * a deadlock once we attempt to lock load_bufs_mutex below. | ||
207 | + */ | ||
208 | + assert(!bql_locked()); | ||
209 | + | ||
210 | + WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { | ||
211 | + /* config state packet should be the last one in the stream */ | ||
212 | + if (packet->flags & VFIO_DEVICE_STATE_CONFIG_STATE) { | ||
213 | + multifd->load_buf_idx_last = packet->idx; | ||
214 | + } | ||
215 | + | ||
216 | + if (!vfio_load_state_buffer_insert(vbasedev, packet, data_size, | ||
217 | + errp)) { | ||
218 | + return false; | ||
219 | + } | ||
220 | + | ||
221 | + qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); | ||
222 | + } | ||
223 | + | ||
224 | + return true; | ||
225 | +} | ||
226 | + | ||
227 | static VFIOMultifd *vfio_multifd_new(void) | ||
228 | { | ||
229 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); | ||
230 | |||
231 | + vfio_state_buffers_init(&multifd->load_bufs); | ||
232 | + | ||
233 | + qemu_mutex_init(&multifd->load_bufs_mutex); | ||
234 | + | ||
235 | + multifd->load_buf_idx = 0; | ||
236 | + multifd->load_buf_idx_last = UINT32_MAX; | ||
237 | + qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); | ||
238 | + | ||
239 | return multifd; | ||
240 | } | ||
241 | |||
242 | static void vfio_multifd_free(VFIOMultifd *multifd) | ||
243 | { | ||
244 | + vfio_state_buffers_destroy(&multifd->load_bufs); | ||
245 | + qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); | ||
246 | + qemu_mutex_destroy(&multifd->load_bufs_mutex); | ||
247 | + | ||
248 | g_free(multifd); | ||
249 | } | ||
250 | |||
251 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
252 | index XXXXXXX..XXXXXXX 100644 | ||
253 | --- a/hw/vfio/migration.c | ||
254 | +++ b/hw/vfio/migration.c | ||
255 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { | ||
256 | .load_cleanup = vfio_load_cleanup, | ||
257 | .load_state = vfio_load_state, | ||
258 | .switchover_ack_needed = vfio_switchover_ack_needed, | ||
259 | + /* | ||
260 | + * Multifd support | ||
261 | + */ | ||
262 | + .load_state_buffer = vfio_multifd_load_state_buffer, | ||
263 | }; | ||
264 | |||
265 | /* ---------------------------------------------------------------------- */ | ||
266 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | ||
267 | index XXXXXXX..XXXXXXX 100644 | ||
268 | --- a/hw/vfio/trace-events | ||
269 | +++ b/hw/vfio/trace-events | ||
270 | @@ -XXX,XX +XXX,XX @@ vfio_load_device_config_state_start(const char *name) " (%s)" | ||
271 | vfio_load_device_config_state_end(const char *name) " (%s)" | ||
272 | vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 | ||
273 | vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" | ||
274 | +vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | ||
275 | vfio_migration_realize(const char *name) " (%s)" | ||
276 | vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s" | ||
277 | vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s" | ||
278 | -- | ||
279 | 2.48.1 | ||
280 | |||
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Add a thread which loads the VFIO device state buffers that were received | ||
4 | via multifd. | ||
5 | |||
6 | Each VFIO device that has multifd device state transfer enabled has one | ||
7 | such thread, which is created using migration core API | ||
8 | qemu_loadvm_start_load_thread(). | ||
9 | |||
10 | Since it's important to finish loading the device state transferred via the | ||
11 | main migration channel (via the save_live_iterate SaveVMHandler) before | ||
12 | starting to load the data asynchronously transferred via multifd, the thread | ||
13 | doing the actual loading of the multifd-transferred data is only started | ||
14 | from the switchover_start SaveVMHandler. | ||
15 | |||
16 | The switchover_start handler is called when the MIG_CMD_SWITCHOVER_START | ||
17 | sub-command of QEMU_VM_COMMAND is received via the main migration channel. | ||
18 | |||
19 | This sub-command is only sent after all save_live_iterate data have already | ||
20 | been posted, so it is safe to commence loading of the multifd-transferred | ||
21 | device state upon receiving it - loading of save_live_iterate data happens | ||
22 | synchronously in the main migration thread (much like the processing of | ||
23 | MIG_CMD_SWITCHOVER_START), so by the time MIG_CMD_SWITCHOVER_START is | ||
24 | processed all the preceding data must have already been loaded. | ||
25 | |||
26 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
27 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
28 | Link: https://lore.kernel.org/qemu-devel/9abe612d775aaf42e31646796acd2363c723a57a.1741124640.git.maciej.szmigiero@oracle.com | ||
29 | [ clg: - Reordered savevm_vfio_handlers | ||
30 | - Added switchover_start documentation ] | ||
31 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
32 | --- | ||
33 | docs/devel/migration/vfio.rst | 4 + | ||
34 | hw/vfio/migration-multifd.h | 2 + | ||
35 | hw/vfio/migration-multifd.c | 226 ++++++++++++++++++++++++++++++++++ | ||
36 | hw/vfio/migration.c | 12 ++ | ||
37 | hw/vfio/trace-events | 7 ++ | ||
38 | 5 files changed, 251 insertions(+) | ||
39 | |||
40 | diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst | ||
41 | index XXXXXXX..XXXXXXX 100644 | ||
42 | --- a/docs/devel/migration/vfio.rst | ||
43 | +++ b/docs/devel/migration/vfio.rst | ||
44 | @@ -XXX,XX +XXX,XX @@ VFIO implements the device hooks for the iterative approach as follows: | ||
45 | * A ``switchover_ack_needed`` function that checks if the VFIO device uses | ||
46 | "switchover-ack" migration capability when this capability is enabled. | ||
47 | |||
48 | +* A ``switchover_start`` function that in the multifd mode starts a thread that | ||
49 | + reassembles the multifd received data and loads it in-order into the device. | ||
50 | + In the non-multifd mode this function is a NOP. | ||
51 | + | ||
52 | * A ``save_state`` function to save the device config space if it is present. | ||
53 | |||
54 | * A ``save_live_complete_precopy`` function that sets the VFIO device in | ||
55 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | ||
56 | index XXXXXXX..XXXXXXX 100644 | ||
57 | --- a/hw/vfio/migration-multifd.h | ||
58 | +++ b/hw/vfio/migration-multifd.h | ||
59 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); | ||
60 | bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, | ||
61 | Error **errp); | ||
62 | |||
63 | +int vfio_multifd_switchover_start(VFIODevice *vbasedev); | ||
64 | + | ||
65 | #endif | ||
66 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
67 | index XXXXXXX..XXXXXXX 100644 | ||
68 | --- a/hw/vfio/migration-multifd.c | ||
69 | +++ b/hw/vfio/migration-multifd.c | ||
70 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOStateBuffer { | ||
71 | } VFIOStateBuffer; | ||
72 | |||
73 | typedef struct VFIOMultifd { | ||
74 | + bool load_bufs_thread_running; | ||
75 | + bool load_bufs_thread_want_exit; | ||
76 | + | ||
77 | VFIOStateBuffers load_bufs; | ||
78 | QemuCond load_bufs_buffer_ready_cond; | ||
79 | + QemuCond load_bufs_thread_finished_cond; | ||
80 | QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ | ||
81 | uint32_t load_buf_idx; | ||
82 | uint32_t load_buf_idx_last; | ||
83 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, | ||
84 | return true; | ||
85 | } | ||
86 | |||
87 | +static bool vfio_load_bufs_thread_load_config(VFIODevice *vbasedev, | ||
88 | + Error **errp) | ||
89 | +{ | ||
90 | + error_setg(errp, "not yet there"); | ||
91 | + return false; | ||
92 | +} | ||
93 | + | ||
94 | +static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) | ||
95 | +{ | ||
96 | + VFIOStateBuffer *lb; | ||
97 | + unsigned int bufs_len; | ||
98 | + | ||
99 | + bufs_len = vfio_state_buffers_size_get(&multifd->load_bufs); | ||
100 | + if (multifd->load_buf_idx >= bufs_len) { | ||
101 | + assert(multifd->load_buf_idx == bufs_len); | ||
102 | + return NULL; | ||
103 | + } | ||
104 | + | ||
105 | + lb = vfio_state_buffers_at(&multifd->load_bufs, | ||
106 | + multifd->load_buf_idx); | ||
107 | + if (!lb->is_present) { | ||
108 | + return NULL; | ||
109 | + } | ||
110 | + | ||
111 | + return lb; | ||
112 | +} | ||
113 | + | ||
114 | +static bool vfio_load_state_buffer_write(VFIODevice *vbasedev, | ||
115 | + VFIOStateBuffer *lb, | ||
116 | + Error **errp) | ||
117 | +{ | ||
118 | + VFIOMigration *migration = vbasedev->migration; | ||
119 | + VFIOMultifd *multifd = migration->multifd; | ||
120 | + g_autofree char *buf = NULL; | ||
121 | + char *buf_cur; | ||
122 | + size_t buf_len; | ||
123 | + | ||
124 | + if (!lb->len) { | ||
125 | + return true; | ||
126 | + } | ||
127 | + | ||
128 | + trace_vfio_load_state_device_buffer_load_start(vbasedev->name, | ||
129 | + multifd->load_buf_idx); | ||
130 | + | ||
131 | + /* lb might become re-allocated when we drop the lock */ | ||
132 | + buf = g_steal_pointer(&lb->data); | ||
133 | + buf_cur = buf; | ||
134 | + buf_len = lb->len; | ||
135 | + while (buf_len > 0) { | ||
136 | + ssize_t wr_ret; | ||
137 | + int errno_save; | ||
138 | + | ||
139 | + /* | ||
140 | + * Loading data to the device takes a while, | ||
141 | + * drop the lock during this process. | ||
142 | + */ | ||
143 | + qemu_mutex_unlock(&multifd->load_bufs_mutex); | ||
144 | + wr_ret = write(migration->data_fd, buf_cur, buf_len); | ||
145 | + errno_save = errno; | ||
146 | + qemu_mutex_lock(&multifd->load_bufs_mutex); | ||
147 | + | ||
148 | + if (wr_ret < 0) { | ||
149 | + error_setg(errp, | ||
150 | + "%s: writing state buffer %" PRIu32 " failed: %d", | ||
151 | + vbasedev->name, multifd->load_buf_idx, errno_save); | ||
152 | + return false; | ||
153 | + } | ||
154 | + | ||
155 | + assert(wr_ret <= buf_len); | ||
156 | + buf_len -= wr_ret; | ||
157 | + buf_cur += wr_ret; | ||
158 | + } | ||
159 | + | ||
160 | + trace_vfio_load_state_device_buffer_load_end(vbasedev->name, | ||
161 | + multifd->load_buf_idx); | ||
162 | + | ||
163 | + return true; | ||
164 | +} | ||
165 | + | ||
166 | +static bool vfio_load_bufs_thread_want_exit(VFIOMultifd *multifd, | ||
167 | + bool *should_quit) | ||
168 | +{ | ||
169 | + return multifd->load_bufs_thread_want_exit || qatomic_read(should_quit); | ||
170 | +} | ||
171 | + | ||
172 | +/* | ||
173 | + * This thread is spawned by vfio_multifd_switchover_start() which gets | ||
174 | + * called upon encountering the switchover point marker in main migration | ||
175 | + * stream. | ||
176 | + * | ||
177 | + * It exits after either: | ||
178 | + * * completing loading the remaining device state and device config, OR: | ||
179 | + * * encountering some error while doing the above, OR: | ||
180 | + * * being forcefully aborted by the migration core by it setting should_quit | ||
181 | + * or by vfio_load_cleanup_load_bufs_thread() setting | ||
182 | + * multifd->load_bufs_thread_want_exit. | ||
183 | + */ | ||
184 | +static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp) | ||
185 | +{ | ||
186 | + VFIODevice *vbasedev = opaque; | ||
187 | + VFIOMigration *migration = vbasedev->migration; | ||
188 | + VFIOMultifd *multifd = migration->multifd; | ||
189 | + bool ret = false; | ||
190 | + | ||
191 | + trace_vfio_load_bufs_thread_start(vbasedev->name); | ||
192 | + | ||
193 | + assert(multifd); | ||
194 | + QEMU_LOCK_GUARD(&multifd->load_bufs_mutex); | ||
195 | + | ||
196 | + assert(multifd->load_bufs_thread_running); | ||
197 | + | ||
198 | + while (true) { | ||
199 | + VFIOStateBuffer *lb; | ||
200 | + | ||
201 | + /* | ||
202 | + * Always check for cancellation first, since the buffer_ready wait | ||
203 | + * below may have been signalled by vfio_load_cleanup_load_bufs_thread(). | ||
204 | + */ | ||
205 | + if (vfio_load_bufs_thread_want_exit(multifd, should_quit)) { | ||
206 | + error_setg(errp, "operation cancelled"); | ||
207 | + goto thread_exit; | ||
208 | + } | ||
209 | + | ||
210 | + assert(multifd->load_buf_idx <= multifd->load_buf_idx_last); | ||
211 | + | ||
212 | + lb = vfio_load_state_buffer_get(multifd); | ||
213 | + if (!lb) { | ||
214 | + trace_vfio_load_state_device_buffer_starved(vbasedev->name, | ||
215 | + multifd->load_buf_idx); | ||
216 | + qemu_cond_wait(&multifd->load_bufs_buffer_ready_cond, | ||
217 | + &multifd->load_bufs_mutex); | ||
218 | + continue; | ||
219 | + } | ||
220 | + | ||
221 | + if (multifd->load_buf_idx == multifd->load_buf_idx_last) { | ||
222 | + break; | ||
223 | + } | ||
224 | + | ||
225 | + if (multifd->load_buf_idx == 0) { | ||
226 | + trace_vfio_load_state_device_buffer_start(vbasedev->name); | ||
227 | + } | ||
228 | + | ||
229 | + if (!vfio_load_state_buffer_write(vbasedev, lb, errp)) { | ||
230 | + goto thread_exit; | ||
231 | + } | ||
232 | + | ||
233 | + if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) { | ||
234 | + trace_vfio_load_state_device_buffer_end(vbasedev->name); | ||
235 | + } | ||
236 | + | ||
237 | + multifd->load_buf_idx++; | ||
238 | + } | ||
239 | + | ||
240 | + if (!vfio_load_bufs_thread_load_config(vbasedev, errp)) { | ||
241 | + goto thread_exit; | ||
242 | + } | ||
243 | + | ||
244 | + ret = true; | ||
245 | + | ||
246 | +thread_exit: | ||
247 | + /* | ||
248 | + * Notify possibly waiting vfio_load_cleanup_load_bufs_thread() that | ||
249 | + * this thread is exiting. | ||
250 | + */ | ||
251 | + multifd->load_bufs_thread_running = false; | ||
252 | + qemu_cond_signal(&multifd->load_bufs_thread_finished_cond); | ||
253 | + | ||
254 | + trace_vfio_load_bufs_thread_end(vbasedev->name); | ||
255 | + | ||
256 | + return ret; | ||
257 | +} | ||
258 | + | ||
259 | static VFIOMultifd *vfio_multifd_new(void) | ||
260 | { | ||
261 | VFIOMultifd *multifd = g_new(VFIOMultifd, 1); | ||
262 | @@ -XXX,XX +XXX,XX @@ static VFIOMultifd *vfio_multifd_new(void) | ||
263 | multifd->load_buf_idx_last = UINT32_MAX; | ||
264 | qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); | ||
265 | |||
266 | + multifd->load_bufs_thread_running = false; | ||
267 | + multifd->load_bufs_thread_want_exit = false; | ||
268 | + qemu_cond_init(&multifd->load_bufs_thread_finished_cond); | ||
269 | + | ||
270 | return multifd; | ||
271 | } | ||
272 | |||
273 | +/* | ||
274 | + * Terminates vfio_load_bufs_thread by setting | ||
275 | + * multifd->load_bufs_thread_want_exit and signalling all the conditions | ||
276 | + * the thread could be blocked on. | ||
277 | + * | ||
278 | + * Waits for the thread to signal that it has finished. | ||
279 | + */ | ||
280 | +static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd) | ||
281 | +{ | ||
282 | + /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ | ||
283 | + bql_unlock(); | ||
284 | + WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { | ||
285 | + while (multifd->load_bufs_thread_running) { | ||
286 | + multifd->load_bufs_thread_want_exit = true; | ||
287 | + | ||
288 | + qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); | ||
289 | + qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, | ||
290 | + &multifd->load_bufs_mutex); | ||
291 | + } | ||
292 | + } | ||
293 | + bql_lock(); | ||
294 | +} | ||
295 | + | ||
296 | static void vfio_multifd_free(VFIOMultifd *multifd) | ||
297 | { | ||
298 | + vfio_load_cleanup_load_bufs_thread(multifd); | ||
299 | + | ||
300 | + qemu_cond_destroy(&multifd->load_bufs_thread_finished_cond); | ||
301 | vfio_state_buffers_destroy(&multifd->load_bufs); | ||
302 | qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); | ||
303 | qemu_mutex_destroy(&multifd->load_bufs_mutex); | ||
304 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) | ||
305 | |||
306 | return true; | ||
307 | } | ||
308 | + | ||
309 | +int vfio_multifd_switchover_start(VFIODevice *vbasedev) | ||
310 | +{ | ||
311 | + VFIOMigration *migration = vbasedev->migration; | ||
312 | + VFIOMultifd *multifd = migration->multifd; | ||
313 | + | ||
314 | + assert(multifd); | ||
315 | + | ||
316 | + /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ | ||
317 | + bql_unlock(); | ||
318 | + WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { | ||
319 | + assert(!multifd->load_bufs_thread_running); | ||
320 | + multifd->load_bufs_thread_running = true; | ||
321 | + } | ||
322 | + bql_lock(); | ||
323 | + | ||
324 | + qemu_loadvm_start_load_thread(vfio_load_bufs_thread, vbasedev); | ||
325 | + | ||
326 | + return 0; | ||
327 | +} | ||
328 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
329 | index XXXXXXX..XXXXXXX 100644 | ||
330 | --- a/hw/vfio/migration.c | ||
331 | +++ b/hw/vfio/migration.c | ||
332 | @@ -XXX,XX +XXX,XX @@ static bool vfio_switchover_ack_needed(void *opaque) | ||
333 | return vfio_precopy_supported(vbasedev); | ||
334 | } | ||
335 | |||
336 | +static int vfio_switchover_start(void *opaque) | ||
337 | +{ | ||
338 | + VFIODevice *vbasedev = opaque; | ||
339 | + | ||
340 | + if (vfio_multifd_transfer_enabled(vbasedev)) { | ||
341 | + return vfio_multifd_switchover_start(vbasedev); | ||
342 | + } | ||
343 | + | ||
344 | + return 0; | ||
345 | +} | ||
346 | + | ||
347 | static const SaveVMHandlers savevm_vfio_handlers = { | ||
348 | .save_prepare = vfio_save_prepare, | ||
349 | .save_setup = vfio_save_setup, | ||
350 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { | ||
351 | * Multifd support | ||
352 | */ | ||
353 | .load_state_buffer = vfio_multifd_load_state_buffer, | ||
354 | + .switchover_start = vfio_switchover_start, | ||
355 | }; | ||
356 | |||
357 | /* ---------------------------------------------------------------------- */ | ||
358 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | ||
359 | index XXXXXXX..XXXXXXX 100644 | ||
360 | --- a/hw/vfio/trace-events | ||
361 | +++ b/hw/vfio/trace-events | ||
362 | @@ -XXX,XX +XXX,XX @@ vfio_display_edid_update(uint32_t prefx, uint32_t prefy) "%ux%u" | ||
363 | vfio_display_edid_write_error(void) "" | ||
364 | |||
365 | # migration.c | ||
366 | +vfio_load_bufs_thread_start(const char *name) " (%s)" | ||
367 | +vfio_load_bufs_thread_end(const char *name) " (%s)" | ||
368 | vfio_load_cleanup(const char *name) " (%s)" | ||
369 | vfio_load_device_config_state_start(const char *name) " (%s)" | ||
370 | vfio_load_device_config_state_end(const char *name) " (%s)" | ||
371 | vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 | ||
372 | vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" | ||
373 | vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | ||
374 | +vfio_load_state_device_buffer_start(const char *name) " (%s)" | ||
375 | +vfio_load_state_device_buffer_starved(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | ||
376 | +vfio_load_state_device_buffer_load_start(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | ||
377 | +vfio_load_state_device_buffer_load_end(const char *name, uint32_t idx) " (%s) idx %"PRIu32 | ||
378 | +vfio_load_state_device_buffer_end(const char *name) " (%s)" | ||
379 | vfio_migration_realize(const char *name) " (%s)" | ||
380 | vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s" | ||
381 | vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s" | ||
382 | -- | ||
383 | 2.48.1 | ||
384 | |||
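
The load-buffers thread above consumes multifd buffers strictly in index
order: it sleeps on load_bufs_buffer_ready_cond until the buffer at
load_buf_idx is present, writes it out, then advances the index. Below is a
self-contained pthread sketch of just that ordering scheme (in-order
consumption of out-of-order arrivals); the slot count, names and printouts
are illustrative and none of it is QEMU code.

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define NUM_SLOTS 8

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t slot_ready = PTHREAD_COND_INITIALIZER;
    static bool present[NUM_SLOTS];   /* slot i has arrived */
    static unsigned next_idx;         /* next slot to consume, in order */

    /* Producer side: slots may arrive in any order (like multifd packets). */
    static void deliver(unsigned idx)
    {
        pthread_mutex_lock(&lock);
        present[idx] = true;
        pthread_cond_signal(&slot_ready);
        pthread_mutex_unlock(&lock);
    }

    /* Consumer side: process slots strictly in index order. */
    static void *consume(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        while (next_idx < NUM_SLOTS) {
            if (!present[next_idx]) {
                /* Starved: the next in-order slot has not arrived yet. */
                pthread_cond_wait(&slot_ready, &lock);
                continue;
            }
            printf("processing slot %u\n", next_idx);
            next_idx++;
        }
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t thr;
        pthread_create(&thr, NULL, consume, NULL);

        /* Deliver out of order; the consumer still handles 0..7 in order. */
        static const unsigned order[NUM_SLOTS] = { 3, 0, 1, 5, 2, 4, 7, 6 };
        for (int i = 0; i < NUM_SLOTS; i++) {
            deliver(order[i]);
        }

        pthread_join(thr, NULL);
        return 0;
    }
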
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Automatic memory management helps avoid memory safety issues. | ||
4 | |||
5 | Reviewed-by: Fabiano Rosas <farosas@suse.de> | ||
6 | Reviewed-by: Peter Xu <peterx@redhat.com> | ||
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
8 | Link: https://lore.kernel.org/qemu-devel/2fd01d773a783d572dcf538a064a98cc09e75c12.1741124640.git.maciej.szmigiero@oracle.com | ||
9 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
10 | --- | ||
11 | migration/qemu-file.h | 2 ++ | ||
12 | 1 file changed, 2 insertions(+) | ||
13 | |||
14 | diff --git a/migration/qemu-file.h b/migration/qemu-file.h | ||
15 | index XXXXXXX..XXXXXXX 100644 | ||
16 | --- a/migration/qemu-file.h | ||
17 | +++ b/migration/qemu-file.h | ||
18 | @@ -XXX,XX +XXX,XX @@ QEMUFile *qemu_file_new_input(QIOChannel *ioc); | ||
19 | QEMUFile *qemu_file_new_output(QIOChannel *ioc); | ||
20 | int qemu_fclose(QEMUFile *f); | ||
21 | |||
22 | +G_DEFINE_AUTOPTR_CLEANUP_FUNC(QEMUFile, qemu_fclose) | ||
23 | + | ||
24 | /* | ||
25 | * qemu_file_transferred: | ||
26 | * | ||
27 | -- | ||
28 | 2.48.1 | ||
29 | |||
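
With the cleanup function registered, later patches can declare
g_autoptr(QEMUFile) variables that are closed automatically on scope exit.
A minimal GLib sketch of the same pattern follows, using a made-up Buffer
type instead of QEMUFile so that it builds standalone against glib-2.0 only.

    #include <glib.h>

    typedef struct Buffer {
        char *data;
    } Buffer;

    static Buffer *buffer_new(const char *text)
    {
        Buffer *b = g_new0(Buffer, 1);
        b->data = g_strdup(text);
        return b;
    }

    static void buffer_free(Buffer *b)
    {
        g_free(b->data);
        g_free(b);
    }

    /* After this, g_autoptr(Buffer) calls buffer_free() on scope exit. */
    G_DEFINE_AUTOPTR_CLEANUP_FUNC(Buffer, buffer_free)

    int main(void)
    {
        g_autoptr(Buffer) b = buffer_new("hello");

        g_print("%s\n", b->data);
        /* No explicit buffer_free(): the cleanup runs automatically here. */
        return 0;
    }
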
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Load device config received via multifd using the existing machinery | ||
4 | behind vfio_load_device_config_state(). | ||
5 | |||
6 | Also, make sure to process the relevant main migration channel flags. | ||
7 | |||
8 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
9 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
10 | Link: https://lore.kernel.org/qemu-devel/5dbd3f3703ec1097da2cf82a7262233452146fee.1741124640.git.maciej.szmigiero@oracle.com | ||
11 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
12 | --- | ||
13 | include/hw/vfio/vfio-common.h | 2 ++ | ||
14 | hw/vfio/migration-multifd.c | 49 +++++++++++++++++++++++++++++++++-- | ||
15 | hw/vfio/migration.c | 9 ++++++- | ||
16 | 3 files changed, 57 insertions(+), 3 deletions(-) | ||
17 | |||
18 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | ||
19 | index XXXXXXX..XXXXXXX 100644 | ||
20 | --- a/include/hw/vfio/vfio-common.h | ||
21 | +++ b/include/hw/vfio/vfio-common.h | ||
22 | @@ -XXX,XX +XXX,XX @@ void vfio_mig_add_bytes_transferred(unsigned long val); | ||
23 | bool vfio_device_state_is_running(VFIODevice *vbasedev); | ||
24 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); | ||
25 | |||
26 | +int vfio_load_device_config_state(QEMUFile *f, void *opaque); | ||
27 | + | ||
28 | #ifdef CONFIG_LINUX | ||
29 | int vfio_get_region_info(VFIODevice *vbasedev, int index, | ||
30 | struct vfio_region_info **info); | ||
31 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
32 | index XXXXXXX..XXXXXXX 100644 | ||
33 | --- a/hw/vfio/migration-multifd.c | ||
34 | +++ b/hw/vfio/migration-multifd.c | ||
35 | @@ -XXX,XX +XXX,XX @@ | ||
36 | #include "qemu/lockable.h" | ||
37 | #include "qemu/main-loop.h" | ||
38 | #include "qemu/thread.h" | ||
39 | +#include "io/channel-buffer.h" | ||
40 | #include "migration/qemu-file.h" | ||
41 | #include "migration-multifd.h" | ||
42 | #include "trace.h" | ||
43 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, | ||
44 | static bool vfio_load_bufs_thread_load_config(VFIODevice *vbasedev, | ||
45 | Error **errp) | ||
46 | { | ||
47 | - error_setg(errp, "not yet there"); | ||
48 | - return false; | ||
49 | + VFIOMigration *migration = vbasedev->migration; | ||
50 | + VFIOMultifd *multifd = migration->multifd; | ||
51 | + VFIOStateBuffer *lb; | ||
52 | + g_autoptr(QIOChannelBuffer) bioc = NULL; | ||
53 | + g_autoptr(QEMUFile) f_out = NULL, f_in = NULL; | ||
54 | + uint64_t mig_header; | ||
55 | + int ret; | ||
56 | + | ||
57 | + assert(multifd->load_buf_idx == multifd->load_buf_idx_last); | ||
58 | + lb = vfio_state_buffers_at(&multifd->load_bufs, multifd->load_buf_idx); | ||
59 | + assert(lb->is_present); | ||
60 | + | ||
61 | + bioc = qio_channel_buffer_new(lb->len); | ||
62 | + qio_channel_set_name(QIO_CHANNEL(bioc), "vfio-device-config-load"); | ||
63 | + | ||
64 | + f_out = qemu_file_new_output(QIO_CHANNEL(bioc)); | ||
65 | + qemu_put_buffer(f_out, (uint8_t *)lb->data, lb->len); | ||
66 | + | ||
67 | + ret = qemu_fflush(f_out); | ||
68 | + if (ret) { | ||
69 | + error_setg(errp, "%s: load config state flush failed: %d", | ||
70 | + vbasedev->name, ret); | ||
71 | + return false; | ||
72 | + } | ||
73 | + | ||
74 | + qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL); | ||
75 | + f_in = qemu_file_new_input(QIO_CHANNEL(bioc)); | ||
76 | + | ||
77 | + mig_header = qemu_get_be64(f_in); | ||
78 | + if (mig_header != VFIO_MIG_FLAG_DEV_CONFIG_STATE) { | ||
79 | + error_setg(errp, "%s: expected FLAG_DEV_CONFIG_STATE but got %" PRIx64, | ||
80 | + vbasedev->name, mig_header); | ||
81 | + return false; | ||
82 | + } | ||
83 | + | ||
84 | + bql_lock(); | ||
85 | + ret = vfio_load_device_config_state(f_in, vbasedev); | ||
86 | + bql_unlock(); | ||
87 | + | ||
88 | + if (ret < 0) { | ||
89 | + error_setg(errp, "%s: vfio_load_device_config_state() failed: %d", | ||
90 | + vbasedev->name, ret); | ||
91 | + return false; | ||
92 | + } | ||
93 | + | ||
94 | + return true; | ||
95 | } | ||
96 | |||
97 | static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) | ||
98 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
99 | index XXXXXXX..XXXXXXX 100644 | ||
100 | --- a/hw/vfio/migration.c | ||
101 | +++ b/hw/vfio/migration.c | ||
102 | @@ -XXX,XX +XXX,XX @@ static int vfio_save_device_config_state(QEMUFile *f, void *opaque, | ||
103 | return ret; | ||
104 | } | ||
105 | |||
106 | -static int vfio_load_device_config_state(QEMUFile *f, void *opaque) | ||
107 | +int vfio_load_device_config_state(QEMUFile *f, void *opaque) | ||
108 | { | ||
109 | VFIODevice *vbasedev = opaque; | ||
110 | uint64_t data; | ||
111 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) | ||
112 | switch (data) { | ||
113 | case VFIO_MIG_FLAG_DEV_CONFIG_STATE: | ||
114 | { | ||
115 | + if (vfio_multifd_transfer_enabled(vbasedev)) { | ||
116 | + error_report("%s: got DEV_CONFIG_STATE in main migration " | ||
117 | + "channel but doing multifd transfer", | ||
118 | + vbasedev->name); | ||
119 | + return -EINVAL; | ||
120 | + } | ||
121 | + | ||
122 | return vfio_load_device_config_state(f, opaque); | ||
123 | } | ||
124 | case VFIO_MIG_FLAG_DEV_SETUP_STATE: | ||
125 | -- | ||
126 | 2.48.1 | ||
127 | |||
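
The load path above wraps the received bytes in a QIOChannelBuffer-backed
QEMUFile so that the existing stream-oriented vfio_load_device_config_state()
can be reused unchanged. The sketch below shows the same trick in plain POSIX
C, wrapping an in-memory blob with fmemopen() so a stream-only parser can read
it; the blob layout and the parser are made up for illustration.

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <inttypes.h>

    /* Existing loader that only knows how to read from a FILE stream. */
    static int load_config_from_stream(FILE *f)
    {
        uint32_t version;
        char name[16] = { 0 };

        if (fread(&version, sizeof(version), 1, f) != 1 ||
            fread(name, 1, sizeof(name) - 1, f) == 0) {
            return -1;
        }
        printf("version %" PRIu32 ", name '%s'\n", version, name);
        return 0;
    }

    int main(void)
    {
        /* Pretend this blob arrived out-of-band, e.g. in a multifd packet. */
        unsigned char blob[20] = { 0 };
        uint32_t version = 2;

        memcpy(blob, &version, sizeof(version));
        memcpy(blob + sizeof(version), "vfio-config", sizeof("vfio-config"));

        /* Wrap the in-memory buffer in a stream; reuse the loader as-is. */
        FILE *f = fmemopen(blob, sizeof(blob), "r");
        if (!f) {
            return 1;
        }
        int ret = load_config_from_stream(f);
        fclose(f);
        return ret ? 1 : 0;
    }
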
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Implement the multifd device state transfer via an additional per-device | ||
4 | thread inside the save_live_complete_precopy_thread handler. | ||
5 | |||
6 | Switch between doing the data transfer in the new handler and doing it | ||
7 | in the old save_state handler depending on whether VFIO multifd transfer | ||
8 | is enabled. | ||
9 | |||
10 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
11 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
12 | Link: https://lore.kernel.org/qemu-devel/4d727e2e0435e0022d50004e474077632830e08d.1741124640.git.maciej.szmigiero@oracle.com | ||
13 | [ clg: - Reordered savevm_vfio_handlers | ||
14 | - Updated save_live_complete_precopy* documentation ] | ||
15 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
16 | --- | ||
17 | docs/devel/migration/vfio.rst | 19 ++++- | ||
18 | hw/vfio/migration-multifd.h | 6 ++ | ||
19 | include/hw/vfio/vfio-common.h | 6 ++ | ||
20 | hw/vfio/migration-multifd.c | 142 ++++++++++++++++++++++++++++++++++ | ||
21 | hw/vfio/migration.c | 22 ++++-- | ||
22 | hw/vfio/trace-events | 2 + | ||
23 | 6 files changed, 189 insertions(+), 8 deletions(-) | ||
24 | |||
25 | diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst | ||
26 | index XXXXXXX..XXXXXXX 100644 | ||
27 | --- a/docs/devel/migration/vfio.rst | ||
28 | +++ b/docs/devel/migration/vfio.rst | ||
29 | @@ -XXX,XX +XXX,XX @@ VFIO implements the device hooks for the iterative approach as follows: | ||
30 | reassembles the multifd received data and loads it in-order into the device. | ||
31 | In the non-multifd mode this function is a NOP. | ||
32 | |||
33 | -* A ``save_state`` function to save the device config space if it is present. | ||
34 | +* A ``save_state`` function to save the device config space if it is present | ||
35 | + in the non-multifd mode. | ||
36 | + In the multifd mode it just emits a dummy EOS marker. | ||
37 | |||
38 | * A ``save_live_complete_precopy`` function that sets the VFIO device in | ||
39 | _STOP_COPY state and iteratively copies the data for the VFIO device until | ||
40 | the vendor driver indicates that no data remains. | ||
41 | + In the multifd mode it just emits a dummy EOS marker. | ||
42 | + | ||
43 | +* A ``save_live_complete_precopy_thread`` function that in the multifd mode | ||
44 | + provides a thread handler performing the multifd device state transfer. | ||
45 | + It sets the VFIO device to _STOP_COPY state, iteratively reads the data | ||
46 | + from the VFIO device and queues it for multifd transmission until the vendor | ||
47 | + driver indicates that no data remains. | ||
48 | + After that, it saves the device config space and queues it for multifd | ||
49 | + transfer too. | ||
50 | + In the non-multifd mode this thread is a NOP. | ||
51 | |||
52 | * A ``load_state`` function that loads the config section and the data | ||
53 | sections that are generated by the save functions above. | ||
54 | @@ -XXX,XX +XXX,XX @@ Live migration save path | ||
55 | Then the VFIO device is put in _STOP_COPY state | ||
56 | (FINISH_MIGRATE, _ACTIVE, _STOP_COPY) | ||
57 | .save_live_complete_precopy() is called for each active device | ||
58 | - For the VFIO device, iterate in .save_live_complete_precopy() until | ||
59 | + For the VFIO device: in the non-multifd mode iterate in | ||
60 | + .save_live_complete_precopy() until | ||
61 | pending data is 0 | ||
62 | + In the multifd mode this iteration is done in | ||
63 | + .save_live_complete_precopy_thread() instead. | ||
64 | | | ||
65 | (POSTMIGRATE, _COMPLETED, _STOP_COPY) | ||
66 | Migration thread schedules cleanup bottom half and exits | ||
67 | diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h | ||
68 | index XXXXXXX..XXXXXXX 100644 | ||
69 | --- a/hw/vfio/migration-multifd.h | ||
70 | +++ b/hw/vfio/migration-multifd.h | ||
71 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev); | ||
72 | bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size, | ||
73 | Error **errp); | ||
74 | |||
75 | +void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f); | ||
76 | + | ||
77 | +bool | ||
78 | +vfio_multifd_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d, | ||
79 | + Error **errp); | ||
80 | + | ||
81 | int vfio_multifd_switchover_start(VFIODevice *vbasedev); | ||
82 | |||
83 | #endif | ||
84 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | ||
85 | index XXXXXXX..XXXXXXX 100644 | ||
86 | --- a/include/hw/vfio/vfio-common.h | ||
87 | +++ b/include/hw/vfio/vfio-common.h | ||
88 | @@ -XXX,XX +XXX,XX @@ void vfio_mig_add_bytes_transferred(unsigned long val); | ||
89 | bool vfio_device_state_is_running(VFIODevice *vbasedev); | ||
90 | bool vfio_device_state_is_precopy(VFIODevice *vbasedev); | ||
91 | |||
92 | +int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp); | ||
93 | int vfio_load_device_config_state(QEMUFile *f, void *opaque); | ||
94 | |||
95 | #ifdef CONFIG_LINUX | ||
96 | @@ -XXX,XX +XXX,XX @@ struct vfio_info_cap_header * | ||
97 | vfio_get_device_info_cap(struct vfio_device_info *info, uint16_t id); | ||
98 | struct vfio_info_cap_header * | ||
99 | vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id); | ||
100 | + | ||
101 | +int vfio_migration_set_state(VFIODevice *vbasedev, | ||
102 | + enum vfio_device_mig_state new_state, | ||
103 | + enum vfio_device_mig_state recover_state, | ||
104 | + Error **errp); | ||
105 | #endif | ||
106 | |||
107 | bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp); | ||
108 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
109 | index XXXXXXX..XXXXXXX 100644 | ||
110 | --- a/hw/vfio/migration-multifd.c | ||
111 | +++ b/hw/vfio/migration-multifd.c | ||
112 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) | ||
113 | return true; | ||
114 | } | ||
115 | |||
116 | +void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f) | ||
117 | +{ | ||
118 | + assert(vfio_multifd_transfer_enabled(vbasedev)); | ||
119 | + | ||
120 | + /* | ||
121 | + * Emit dummy NOP data on the main migration channel since the actual | ||
122 | + * device state transfer is done via multifd channels. | ||
123 | + */ | ||
124 | + qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE); | ||
125 | +} | ||
126 | + | ||
127 | +static bool | ||
128 | +vfio_save_complete_precopy_thread_config_state(VFIODevice *vbasedev, | ||
129 | + char *idstr, | ||
130 | + uint32_t instance_id, | ||
131 | + uint32_t idx, | ||
132 | + Error **errp) | ||
133 | +{ | ||
134 | + g_autoptr(QIOChannelBuffer) bioc = NULL; | ||
135 | + g_autoptr(QEMUFile) f = NULL; | ||
136 | + int ret; | ||
137 | + g_autofree VFIODeviceStatePacket *packet = NULL; | ||
138 | + size_t packet_len; | ||
139 | + | ||
140 | + bioc = qio_channel_buffer_new(0); | ||
141 | + qio_channel_set_name(QIO_CHANNEL(bioc), "vfio-device-config-save"); | ||
142 | + | ||
143 | + f = qemu_file_new_output(QIO_CHANNEL(bioc)); | ||
144 | + | ||
145 | + if (vfio_save_device_config_state(f, vbasedev, errp)) { | ||
146 | + return false; | ||
147 | + } | ||
148 | + | ||
149 | + ret = qemu_fflush(f); | ||
150 | + if (ret) { | ||
151 | + error_setg(errp, "%s: save config state flush failed: %d", | ||
152 | + vbasedev->name, ret); | ||
153 | + return false; | ||
154 | + } | ||
155 | + | ||
156 | + packet_len = sizeof(*packet) + bioc->usage; | ||
157 | + packet = g_malloc0(packet_len); | ||
158 | + packet->version = VFIO_DEVICE_STATE_PACKET_VER_CURRENT; | ||
159 | + packet->idx = idx; | ||
160 | + packet->flags = VFIO_DEVICE_STATE_CONFIG_STATE; | ||
161 | + memcpy(&packet->data, bioc->data, bioc->usage); | ||
162 | + | ||
163 | + if (!multifd_queue_device_state(idstr, instance_id, | ||
164 | + (char *)packet, packet_len)) { | ||
165 | + error_setg(errp, "%s: multifd config data queuing failed", | ||
166 | + vbasedev->name); | ||
167 | + return false; | ||
168 | + } | ||
169 | + | ||
170 | + vfio_mig_add_bytes_transferred(packet_len); | ||
171 | + | ||
172 | + return true; | ||
173 | +} | ||
174 | + | ||
175 | +/* | ||
176 | + * This thread is spawned by the migration core directly via | ||
177 | + * .save_live_complete_precopy_thread SaveVMHandler. | ||
178 | + * | ||
179 | + * It exits after either: | ||
180 | + * * completing saving the remaining device state and device config, OR: | ||
181 | + * * encountering some error while doing the above, OR: | ||
182 | + * * being forcefully aborted by the migration core by | ||
183 | + * multifd_device_state_save_thread_should_exit() returning true. | ||
184 | + */ | ||
185 | +bool | ||
186 | +vfio_multifd_save_complete_precopy_thread(SaveLiveCompletePrecopyThreadData *d, | ||
187 | + Error **errp) | ||
188 | +{ | ||
189 | + VFIODevice *vbasedev = d->handler_opaque; | ||
190 | + VFIOMigration *migration = vbasedev->migration; | ||
191 | + bool ret = false; | ||
192 | + g_autofree VFIODeviceStatePacket *packet = NULL; | ||
193 | + uint32_t idx; | ||
194 | + | ||
195 | + if (!vfio_multifd_transfer_enabled(vbasedev)) { | ||
196 | + /* Nothing to do, vfio_save_complete_precopy() does the transfer. */ | ||
197 | + return true; | ||
198 | + } | ||
199 | + | ||
200 | + trace_vfio_save_complete_precopy_thread_start(vbasedev->name, | ||
201 | + d->idstr, d->instance_id); | ||
202 | + | ||
203 | + /* We reach here with device state STOP or STOP_COPY only */ | ||
204 | + if (vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY, | ||
205 | + VFIO_DEVICE_STATE_STOP, errp)) { | ||
206 | + goto thread_exit; | ||
207 | + } | ||
208 | + | ||
209 | + packet = g_malloc0(sizeof(*packet) + migration->data_buffer_size); | ||
210 | + packet->version = VFIO_DEVICE_STATE_PACKET_VER_CURRENT; | ||
211 | + | ||
212 | + for (idx = 0; ; idx++) { | ||
213 | + ssize_t data_size; | ||
214 | + size_t packet_size; | ||
215 | + | ||
216 | + if (multifd_device_state_save_thread_should_exit()) { | ||
217 | + error_setg(errp, "operation cancelled"); | ||
218 | + goto thread_exit; | ||
219 | + } | ||
220 | + | ||
221 | + data_size = read(migration->data_fd, &packet->data, | ||
222 | + migration->data_buffer_size); | ||
223 | + if (data_size < 0) { | ||
224 | + error_setg(errp, "%s: reading state buffer %" PRIu32 " failed: %d", | ||
225 | + vbasedev->name, idx, errno); | ||
226 | + goto thread_exit; | ||
227 | + } else if (data_size == 0) { | ||
228 | + break; | ||
229 | + } | ||
230 | + | ||
231 | + packet->idx = idx; | ||
232 | + packet_size = sizeof(*packet) + data_size; | ||
233 | + | ||
234 | + if (!multifd_queue_device_state(d->idstr, d->instance_id, | ||
235 | + (char *)packet, packet_size)) { | ||
236 | + error_setg(errp, "%s: multifd data queuing failed", vbasedev->name); | ||
237 | + goto thread_exit; | ||
238 | + } | ||
239 | + | ||
240 | + vfio_mig_add_bytes_transferred(packet_size); | ||
241 | + } | ||
242 | + | ||
243 | + if (!vfio_save_complete_precopy_thread_config_state(vbasedev, | ||
244 | + d->idstr, | ||
245 | + d->instance_id, | ||
246 | + idx, errp)) { | ||
247 | + goto thread_exit; | ||
248 | + } | ||
249 | + | ||
250 | + ret = true; | ||
251 | + | ||
252 | +thread_exit: | ||
253 | + trace_vfio_save_complete_precopy_thread_end(vbasedev->name, ret); | ||
254 | + | ||
255 | + return ret; | ||
256 | +} | ||
257 | + | ||
258 | int vfio_multifd_switchover_start(VFIODevice *vbasedev) | ||
259 | { | ||
260 | VFIOMigration *migration = vbasedev->migration; | ||
261 | diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c | ||
262 | index XXXXXXX..XXXXXXX 100644 | ||
263 | --- a/hw/vfio/migration.c | ||
264 | +++ b/hw/vfio/migration.c | ||
265 | @@ -XXX,XX +XXX,XX @@ static void vfio_migration_set_device_state(VFIODevice *vbasedev, | ||
266 | vfio_migration_send_event(vbasedev); | ||
267 | } | ||
268 | |||
269 | -static int vfio_migration_set_state(VFIODevice *vbasedev, | ||
270 | - enum vfio_device_mig_state new_state, | ||
271 | - enum vfio_device_mig_state recover_state, | ||
272 | - Error **errp) | ||
273 | +int vfio_migration_set_state(VFIODevice *vbasedev, | ||
274 | + enum vfio_device_mig_state new_state, | ||
275 | + enum vfio_device_mig_state recover_state, | ||
276 | + Error **errp) | ||
277 | { | ||
278 | VFIOMigration *migration = vbasedev->migration; | ||
279 | uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) + | ||
280 | @@ -XXX,XX +XXX,XX @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev, | ||
281 | return ret; | ||
282 | } | ||
283 | |||
284 | -static int vfio_save_device_config_state(QEMUFile *f, void *opaque, | ||
285 | - Error **errp) | ||
286 | +int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) | ||
287 | { | ||
288 | VFIODevice *vbasedev = opaque; | ||
289 | int ret; | ||
290 | @@ -XXX,XX +XXX,XX @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) | ||
291 | int ret; | ||
292 | Error *local_err = NULL; | ||
293 | |||
294 | + if (vfio_multifd_transfer_enabled(vbasedev)) { | ||
295 | + vfio_multifd_emit_dummy_eos(vbasedev, f); | ||
296 | + return 0; | ||
297 | + } | ||
298 | + | ||
299 | trace_vfio_save_complete_precopy_start(vbasedev->name); | ||
300 | |||
301 | /* We reach here with device state STOP or STOP_COPY only */ | ||
302 | @@ -XXX,XX +XXX,XX @@ static void vfio_save_state(QEMUFile *f, void *opaque) | ||
303 | Error *local_err = NULL; | ||
304 | int ret; | ||
305 | |||
306 | + if (vfio_multifd_transfer_enabled(vbasedev)) { | ||
307 | + vfio_multifd_emit_dummy_eos(vbasedev, f); | ||
308 | + return; | ||
309 | + } | ||
310 | + | ||
311 | ret = vfio_save_device_config_state(f, opaque, &local_err); | ||
312 | if (ret) { | ||
313 | error_prepend(&local_err, | ||
314 | @@ -XXX,XX +XXX,XX @@ static const SaveVMHandlers savevm_vfio_handlers = { | ||
315 | */ | ||
316 | .load_state_buffer = vfio_multifd_load_state_buffer, | ||
317 | .switchover_start = vfio_switchover_start, | ||
318 | + .save_live_complete_precopy_thread = vfio_multifd_save_complete_precopy_thread, | ||
319 | }; | ||
320 | |||
321 | /* ---------------------------------------------------------------------- */ | ||
322 | diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events | ||
323 | index XXXXXXX..XXXXXXX 100644 | ||
324 | --- a/hw/vfio/trace-events | ||
325 | +++ b/hw/vfio/trace-events | ||
326 | @@ -XXX,XX +XXX,XX @@ vfio_save_block_precopy_empty_hit(const char *name) " (%s)" | ||
327 | vfio_save_cleanup(const char *name) " (%s)" | ||
328 | vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d" | ||
329 | vfio_save_complete_precopy_start(const char *name) " (%s)" | ||
330 | +vfio_save_complete_precopy_thread_start(const char *name, const char *idstr, uint32_t instance_id) " (%s) idstr %s instance %"PRIu32 | ||
331 | +vfio_save_complete_precopy_thread_end(const char *name, int ret) " (%s) ret %d" | ||
332 | vfio_save_device_config_state(const char *name) " (%s)" | ||
333 | vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64 | ||
334 | vfio_save_iterate_start(const char *name) " (%s)" | ||
335 | -- | ||
336 | 2.48.1 | ||
337 | |||
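
The save thread frames every chunk it reads from the device as a
VFIODeviceStatePacket: a small header carrying version, idx and flags,
followed by the payload. The packet structure itself is defined elsewhere in
the series, so the sketch below only mirrors the layout implied by the usage
above; the field sizes and names here are assumptions, not the actual
definition.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <stdio.h>
    #include <inttypes.h>

    /* Illustrative framing, modelled on the usage in the patch above. */
    typedef struct DeviceStatePacket {
        uint32_t version;   /* packet format version */
        uint32_t idx;       /* buffer index, consumed in order on the target */
        uint32_t flags;     /* e.g. "this is the config blob" marker */
        uint8_t data[];     /* flexible payload read from the device */
    } DeviceStatePacket;

    static DeviceStatePacket *packet_new(uint32_t idx, uint32_t flags,
                                         const void *payload, size_t len)
    {
        DeviceStatePacket *p = calloc(1, sizeof(*p) + len);

        if (!p) {
            return NULL;
        }
        p->version = 1;
        p->idx = idx;
        p->flags = flags;
        memcpy(p->data, payload, len);
        return p;
    }

    int main(void)
    {
        const char chunk[] = "device state chunk";
        DeviceStatePacket *p = packet_new(0, 0, chunk, sizeof(chunk));

        if (!p) {
            return 1;
        }
        printf("idx %" PRIu32 ", payload %zu bytes\n", p->idx, sizeof(chunk));
        free(p);
        return 0;
    }
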
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | This property allows configuring whether to transfer a particular device's | ||
4 | state via multifd channels when live migrating that device. | ||
5 | |||
6 | It defaults to AUTO, which means that VFIO device state transfer via | ||
7 | multifd channels is attempted in configurations that otherwise support it. | ||
8 | |||
9 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
10 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
11 | Link: https://lore.kernel.org/qemu-devel/d6dbb326e3d53c7104d62c96c9e3dd64e1c7b940.1741124640.git.maciej.szmigiero@oracle.com | ||
12 | [ clg: Added documentation ] | ||
13 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
14 | --- | ||
15 | docs/devel/migration/vfio.rst | 15 +++++++++++++++ | ||
16 | include/hw/vfio/vfio-common.h | 2 ++ | ||
17 | hw/vfio/migration-multifd.c | 18 +++++++++++++++++- | ||
18 | hw/vfio/pci.c | 7 +++++++ | ||
19 | 4 files changed, 41 insertions(+), 1 deletion(-) | ||
20 | |||
21 | diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst | ||
22 | index XXXXXXX..XXXXXXX 100644 | ||
23 | --- a/docs/devel/migration/vfio.rst | ||
24 | +++ b/docs/devel/migration/vfio.rst | ||
25 | @@ -XXX,XX +XXX,XX @@ Postcopy | ||
26 | ======== | ||
27 | |||
28 | Postcopy migration is currently not supported for VFIO devices. | ||
29 | + | ||
30 | +Multifd | ||
31 | +======= | ||
32 | + | ||
33 | +Starting with QEMU version 10.0 it is possible to transfer VFIO device | ||
34 | +_STOP_COPY state via multifd channels. This helps reduce downtime, especially | ||
35 | +with multiple VFIO devices or with devices that have a large migration state. | ||
36 | +As an additional benefit, setting the VFIO device to _STOP_COPY state and | ||
37 | +saving its config space is also parallelized (run in a separate thread) in | ||
38 | +this migration mode. | ||
39 | + | ||
40 | +The multifd VFIO device state transfer is controlled by the | ||
41 | +"x-migration-multifd-transfer" VFIO device property. This property defaults to | ||
42 | +AUTO, which means that VFIO device state transfer via multifd channels is | ||
43 | +attempted in configurations that otherwise support it. | ||
44 | diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h | ||
45 | index XXXXXXX..XXXXXXX 100644 | ||
46 | --- a/include/hw/vfio/vfio-common.h | ||
47 | +++ b/include/hw/vfio/vfio-common.h | ||
48 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIOMigration { | ||
49 | uint64_t mig_flags; | ||
50 | uint64_t precopy_init_size; | ||
51 | uint64_t precopy_dirty_size; | ||
52 | + bool multifd_transfer; | ||
53 | VFIOMultifd *multifd; | ||
54 | bool initial_data_sent; | ||
55 | |||
56 | @@ -XXX,XX +XXX,XX @@ typedef struct VFIODevice { | ||
57 | bool no_mmap; | ||
58 | bool ram_block_discard_allowed; | ||
59 | OnOffAuto enable_migration; | ||
60 | + OnOffAuto migration_multifd_transfer; | ||
61 | bool migration_events; | ||
62 | VFIODeviceOps *ops; | ||
63 | unsigned int num_irqs; | ||
64 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
65 | index XXXXXXX..XXXXXXX 100644 | ||
66 | --- a/hw/vfio/migration-multifd.c | ||
67 | +++ b/hw/vfio/migration-multifd.c | ||
68 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_transfer_supported(void) | ||
69 | |||
70 | bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev) | ||
71 | { | ||
72 | - return false; | ||
73 | + VFIOMigration *migration = vbasedev->migration; | ||
74 | + | ||
75 | + return migration->multifd_transfer; | ||
76 | } | ||
77 | |||
78 | bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) | ||
79 | { | ||
80 | VFIOMigration *migration = vbasedev->migration; | ||
81 | |||
82 | + if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { | ||
83 | + migration->multifd_transfer = vfio_multifd_transfer_supported(); | ||
84 | + } else { | ||
85 | + migration->multifd_transfer = | ||
86 | + vbasedev->migration_multifd_transfer == ON_OFF_AUTO_ON; | ||
87 | + } | ||
88 | + | ||
89 | if (!vfio_multifd_transfer_enabled(vbasedev)) { | ||
90 | /* Nothing further to check or do */ | ||
91 | return true; | ||
92 | } | ||
93 | |||
94 | + if (!vfio_multifd_transfer_supported()) { | ||
95 | + error_setg(errp, | ||
96 | + "%s: Multifd device transfer requested but unsupported in the current config", | ||
97 | + vbasedev->name); | ||
98 | + return false; | ||
99 | + } | ||
100 | + | ||
101 | if (alloc_multifd) { | ||
102 | assert(!migration->multifd); | ||
103 | migration->multifd = vfio_multifd_new(); | ||
104 | diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c | ||
105 | index XXXXXXX..XXXXXXX 100644 | ||
106 | --- a/hw/vfio/pci.c | ||
107 | +++ b/hw/vfio/pci.c | ||
108 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { | ||
109 | VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), | ||
110 | DEFINE_PROP_ON_OFF_AUTO("enable-migration", VFIOPCIDevice, | ||
111 | vbasedev.enable_migration, ON_OFF_AUTO_AUTO), | ||
112 | + DEFINE_PROP_ON_OFF_AUTO("x-migration-multifd-transfer", VFIOPCIDevice, | ||
113 | + vbasedev.migration_multifd_transfer, | ||
114 | + ON_OFF_AUTO_AUTO), | ||
115 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, | ||
116 | vbasedev.migration_events, false), | ||
117 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), | ||
118 | @@ -XXX,XX +XXX,XX @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) | ||
119 | "Skip config space check for Vendor Specific Capability. " | ||
120 | "Setting to false will enforce strict checking of VSC content " | ||
121 | "(DEBUG)"); | ||
122 | + object_class_property_set_description(klass, /* 10.0 */ | ||
123 | + "x-migration-multifd-transfer", | ||
124 | + "Transfer this device state via " | ||
125 | + "multifd channels when live migrating it"); | ||
126 | } | ||
127 | |||
128 | static const TypeInfo vfio_pci_dev_info = { | ||
129 | -- | ||
130 | 2.48.1 | ||
131 | |||
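
A possible command line enabling the property explicitly; the host address
and id below are placeholders, and at this point in the series the value is
still fixed at device creation time (the next patch makes it runtime-mutable):

    -device vfio-pci,host=0000:01:00.0,id=vfio0,x-migration-multifd-transfer=on
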
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | A DEFINE_PROP_ON_OFF_AUTO() property isn't runtime-mutable, so using it | ||
4 | would mean that the source VM would need to decide upfront at startup | ||
5 | time whether it wants to do a multifd device state transfer at some | ||
6 | point. | ||
7 | |||
8 | The source VM can run for a long time before being migrated, so it is | ||
9 | desirable to have a fallback mechanism to the old way of transferring | ||
10 | VFIO device state if it turns out to be necessary. | ||
11 | |||
12 | This brings the property to the same mutability level as ordinary | ||
13 | migration parameters, which can also be adjusted at runtime. | ||
14 | |||
15 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
16 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
17 | Link: https://lore.kernel.org/qemu-devel/f2f2d66bda477da3e6cb8c0311006cff36e8651d.1741124640.git.maciej.szmigiero@oracle.com | ||
18 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
19 | --- | ||
20 | hw/vfio/migration-multifd.c | 4 ++++ | ||
21 | hw/vfio/pci.c | 20 +++++++++++++++++--- | ||
22 | 2 files changed, 21 insertions(+), 3 deletions(-) | ||
23 | |||
24 | diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c | ||
25 | index XXXXXXX..XXXXXXX 100644 | ||
26 | --- a/hw/vfio/migration-multifd.c | ||
27 | +++ b/hw/vfio/migration-multifd.c | ||
28 | @@ -XXX,XX +XXX,XX @@ bool vfio_multifd_setup(VFIODevice *vbasedev, bool alloc_multifd, Error **errp) | ||
29 | { | ||
30 | VFIOMigration *migration = vbasedev->migration; | ||
31 | |||
32 | + /* | ||
33 | + * Make a copy of this setting at the start in case it is changed | ||
34 | + * mid-migration. | ||
35 | + */ | ||
36 | if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { | ||
37 | migration->multifd_transfer = vfio_multifd_transfer_supported(); | ||
38 | } else { | ||
39 | diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c | ||
40 | index XXXXXXX..XXXXXXX 100644 | ||
41 | --- a/hw/vfio/pci.c | ||
42 | +++ b/hw/vfio/pci.c | ||
43 | @@ -XXX,XX +XXX,XX @@ static void vfio_instance_init(Object *obj) | ||
44 | pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS; | ||
45 | } | ||
46 | |||
47 | +static PropertyInfo vfio_pci_migration_multifd_transfer_prop; | ||
48 | + | ||
49 | static const Property vfio_pci_dev_properties[] = { | ||
50 | DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), | ||
51 | DEFINE_PROP_UUID_NODEFAULT("vf-token", VFIOPCIDevice, vf_token), | ||
52 | @@ -XXX,XX +XXX,XX @@ static const Property vfio_pci_dev_properties[] = { | ||
53 | VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), | ||
54 | DEFINE_PROP_ON_OFF_AUTO("enable-migration", VFIOPCIDevice, | ||
55 | vbasedev.enable_migration, ON_OFF_AUTO_AUTO), | ||
56 | - DEFINE_PROP_ON_OFF_AUTO("x-migration-multifd-transfer", VFIOPCIDevice, | ||
57 | - vbasedev.migration_multifd_transfer, | ||
58 | - ON_OFF_AUTO_AUTO), | ||
59 | + DEFINE_PROP("x-migration-multifd-transfer", VFIOPCIDevice, | ||
60 | + vbasedev.migration_multifd_transfer, | ||
61 | + vfio_pci_migration_multifd_transfer_prop, OnOffAuto, | ||
62 | + .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), | ||
63 | DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, | ||
64 | vbasedev.migration_events, false), | ||
65 | DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), | ||
66 | @@ -XXX,XX +XXX,XX @@ static const TypeInfo vfio_pci_nohotplug_dev_info = { | ||
67 | |||
68 | static void register_vfio_pci_dev_type(void) | ||
69 | { | ||
70 | + /* | ||
71 | + * An ordinary ON_OFF_AUTO property isn't runtime-mutable, but the source VM | ||
72 | + * can run for a long time before being migrated so it is desirable to have | ||
73 | + * a fallback mechanism to the old way of transferring VFIO device state if | ||
74 | + * it turns out to be necessary. | ||
75 | + * The following makes this type of property have the same mutability level | ||
76 | + * as ordinary migration parameters. | ||
77 | + */ | ||
78 | + vfio_pci_migration_multifd_transfer_prop = qdev_prop_on_off_auto; | ||
79 | + vfio_pci_migration_multifd_transfer_prop.realized_set_allowed = true; | ||
80 | + | ||
81 | type_register_static(&vfio_pci_dev_info); | ||
82 | type_register_static(&vfio_pci_nohotplug_dev_info); | ||
83 | } | ||
84 | -- | ||
85 | 2.48.1 | ||
86 | |||
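
Because realized_set_allowed is now true, the property can be changed on an
already running source VM before migration is started, for instance via the
monitor's qom-set command; the device id below is a placeholder:

    (qemu) qom-set /machine/peripheral/vfio0 x-migration-multifd-transfer on
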
Deleted patch | |||
---|---|---|---|
1 | From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com> | ||
2 | 1 | ||
3 | Add a hw_compat entry for the recently added x-migration-multifd-transfer | ||
4 | VFIO property. | ||
5 | |||
6 | Reviewed-by: Cédric Le Goater <clg@redhat.com> | ||
7 | Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> | ||
8 | Link: https://lore.kernel.org/qemu-devel/92c354f0457c152d1f267cc258c6967fff551cb1.1741124640.git.maciej.szmigiero@oracle.com | ||
9 | Signed-off-by: Cédric Le Goater <clg@redhat.com> | ||
10 | --- | ||
11 | hw/core/machine.c | 1 + | ||
12 | 1 file changed, 1 insertion(+) | ||
13 | |||
14 | diff --git a/hw/core/machine.c b/hw/core/machine.c | ||
15 | index XXXXXXX..XXXXXXX 100644 | ||
16 | --- a/hw/core/machine.c | ||
17 | +++ b/hw/core/machine.c | ||
18 | @@ -XXX,XX +XXX,XX @@ GlobalProperty hw_compat_9_2[] = { | ||
19 | { "virtio-mem-pci", "vectors", "0" }, | ||
20 | { "migration", "multifd-clean-tls-termination", "false" }, | ||
21 | { "migration", "send-switchover-start", "off"}, | ||
22 | + { "vfio-pci", "x-migration-multifd-transfer", "off" }, | ||
23 | }; | ||
24 | const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2); | ||
25 | |||
26 | -- | ||
27 | 2.48.1 | ||
28 | |||
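
The hw_compat_9_2 entry keeps the pre-10.0 behaviour on older machine types
by overriding the property's default there, while newer machine types keep
the AUTO default. The standalone lookup below only illustrates that idea of a
compat table; it is not the QEMU global-property machinery itself.

    #include <stdio.h>
    #include <string.h>

    /* Illustrative compat entry, mirroring the hw_compat_9_2 line above. */
    typedef struct CompatProp {
        const char *driver;
        const char *property;
        const char *value;
    } CompatProp;

    static const CompatProp compat_9_2[] = {
        { "vfio-pci", "x-migration-multifd-transfer", "off" },
    };

    /* Older machine type: a matching compat entry replaces the new default. */
    static const char *effective_default(const char *driver, const char *prop,
                                         const char *new_default)
    {
        for (size_t i = 0; i < sizeof(compat_9_2) / sizeof(compat_9_2[0]); i++) {
            if (!strcmp(compat_9_2[i].driver, driver) &&
                !strcmp(compat_9_2[i].property, prop)) {
                return compat_9_2[i].value;
            }
        }
        return new_default;
    }

    int main(void)
    {
        printf("9.2 machine type default: %s\n",
               effective_default("vfio-pci", "x-migration-multifd-transfer",
                                 "auto"));
        return 0;
    }
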