[PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support()

Maciej S. Szmigiero posted 17 patches 2 months, 4 weeks ago
There is a newer version of this series
[PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support()
Posted by Maciej S. Szmigiero 2 months, 4 weeks ago
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Since device state transfer via multifd channels requires multifd
channels with packets and is currently not compatible with multifd
compression add an appropriate query function so device can learn
whether it can actually make use of it.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 include/migration/misc.h         | 1 +
 migration/multifd-device-state.c | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 7266b1b77d1f..189de6d02ad6 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -114,5 +114,6 @@ void dirty_bitmap_mig_init(void);
 /* migration/multifd-device-state.c */
 bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
                                 char *data, size_t len);
+bool migration_has_device_state_support(void);
 
 #endif
diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
index c9b44f0b5ab9..7b34fe736c7f 100644
--- a/migration/multifd-device-state.c
+++ b/migration/multifd-device-state.c
@@ -11,6 +11,7 @@
 #include "qemu/lockable.h"
 #include "migration/misc.h"
 #include "multifd.h"
+#include "options.h"
 
 static QemuMutex queue_job_mutex;
 
@@ -97,3 +98,9 @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
 
     return true;
 }
+
+bool migration_has_device_state_support(void)
+{
+    return migrate_multifd() && !migrate_mapped_ram() &&
+        migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE;
+}
Re: [PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support()
Posted by Fabiano Rosas 2 months, 3 weeks ago
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:

> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Since device state transfer via multifd channels requires multifd
> channels with packets and is currently not compatible with multifd
> compression add an appropriate query function so device can learn
> whether it can actually make use of it.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

Out of curiosity, what do you see as a blocker for migrating to a file?

We would just need to figure out a mapping from file offset some unit of
data to be able to write in parallel like with ram (of which the page
offset is mapped to the file offset).

> ---
>  include/migration/misc.h         | 1 +
>  migration/multifd-device-state.c | 7 +++++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 7266b1b77d1f..189de6d02ad6 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -114,5 +114,6 @@ void dirty_bitmap_mig_init(void);
>  /* migration/multifd-device-state.c */
>  bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
>                                  char *data, size_t len);
> +bool migration_has_device_state_support(void);
>  
>  #endif
> diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
> index c9b44f0b5ab9..7b34fe736c7f 100644
> --- a/migration/multifd-device-state.c
> +++ b/migration/multifd-device-state.c
> @@ -11,6 +11,7 @@
>  #include "qemu/lockable.h"
>  #include "migration/misc.h"
>  #include "multifd.h"
> +#include "options.h"
>  
>  static QemuMutex queue_job_mutex;
>  
> @@ -97,3 +98,9 @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
>  
>      return true;
>  }
> +
> +bool migration_has_device_state_support(void)
> +{
> +    return migrate_multifd() && !migrate_mapped_ram() &&
> +        migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE;
> +}
Re: [PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support()
Posted by Maciej S. Szmigiero 2 months, 3 weeks ago
On 30.08.2024 20:55, Fabiano Rosas wrote:
> "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
> 
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> Since device state transfer via multifd channels requires multifd
>> channels with packets and is currently not compatible with multifd
>> compression add an appropriate query function so device can learn
>> whether it can actually make use of it.
>>
>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> 
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> 
> Out of curiosity, what do you see as a blocker for migrating to a file?
> 
> We would just need to figure out a mapping from file offset some unit of
> data to be able to write in parallel like with ram (of which the page
> offset is mapped to the file offset).

I'm not sure whether there's a point in that since VFIO devices
just provide a raw device state stream - there's no way to know
that some buffer is no longer needed because it consisted of
dirty data that was completely overwritten by a later buffer.

Also, the device type that the code was developed against - a (smart)
NIC - has so large device state because (more or less) it keeps a lot
of data about network connections passing / made through it.

It doesn't really make sense to make snapshot of such device for later
reload since these connections will be long dropped by their remote
peers by this point.

Such snapshotting might make more sense with GPU VFIO devices though.

If such file migration support is desired at some later point then for
sure the whole code would need to be carefully re-checked for implicit
assumptions.

Thanks,
Maciej
Re: [PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support()
Posted by Fabiano Rosas 2 months, 3 weeks ago
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:

> On 30.08.2024 20:55, Fabiano Rosas wrote:
>> "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
>> 
>>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>>
>>> Since device state transfer via multifd channels requires multifd
>>> channels with packets and is currently not compatible with multifd
>>> compression add an appropriate query function so device can learn
>>> whether it can actually make use of it.
>>>
>>> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
>> 
>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>> 
>> Out of curiosity, what do you see as a blocker for migrating to a file?
>> 
>> We would just need to figure out a mapping from file offset some unit of
>> data to be able to write in parallel like with ram (of which the page
>> offset is mapped to the file offset).
>
> I'm not sure whether there's a point in that since VFIO devices
> just provide a raw device state stream - there's no way to know
> that some buffer is no longer needed because it consisted of
> dirty data that was completely overwritten by a later buffer.
>
> Also, the device type that the code was developed against - a (smart)
> NIC - has so large device state because (more or less) it keeps a lot
> of data about network connections passing / made through it.
>
> It doesn't really make sense to make snapshot of such device for later
> reload since these connections will be long dropped by their remote
> peers by this point.
>
> Such snapshotting might make more sense with GPU VFIO devices though.
>
> If such file migration support is desired at some later point then for
> sure the whole code would need to be carefully re-checked for implicit
> assumptions.

Thanks, let's keep those arguments in mind, maybe we put them in a doc
somewhere so we remember this in the future.

>
> Thanks,
> Maciej