[v3] vhost-user-blk: support inflight migration

[PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Alexandr Moshkov 3 months ago

v3:
- use pre_load_errp instead of pre_load in vhost.c
- change vhost-user-blk property to
  "skip-get-vring-base-inflight-migration"
- refactor vhost-user-blk.c, by moving vhost_user_blk_inflight_needed() higher

v2:
- rewrite migration using VMSD instead of qemufile API
- add vhost-user-blk parameter instead of migration capability

I'm not sure whether VMSD is used cleanly in the migration implementation, so
comments are welcome.

Based on Vladimir's work:
[PATCH v2 00/25] vhost-user-blk: live-backend local migration
  which was based on:
    - [PATCH v4 0/7] chardev: postpone connect
      (which in turn is based on [PATCH 0/2] remove deprecated 'reconnect' options)
    - [PATCH v3 00/23] vhost refactoring and fixes
    - [PATCH v8 14/19] migration: introduce .pre_incoming() vmsd handler

Based-on: <20250924133309.334631-1-vsementsov@yandex-team.ru>
Based-on: <20251015212051.1156334-1-vsementsov@yandex-team.ru>
Based-on: <20251015145808.1112843-1-vsementsov@yandex-team.ru>
Based-on: <20251015132136.1083972-15-vsementsov@yandex-team.ru>
Based-on: <20251016114104.1384675-1-vsementsov@yandex-team.ru>

--- 

Hi!

During inter-host migration, waiting for disk requests to be drained
in the vhost-user backend can incur significant downtime.

This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
Then, during QEMU migration, the vhost-user backend can cancel all inflight
requests, and after migration they will be executed on the other host.

At first, I tried to implement migration for all vhost-user devices that
support inflight at once, but this would require a lot of changes both in
vhost-user-blk (to move it onto the base class) and in the vhost-user-base
base class (inflight implementation and remodeling, plus a large refactor).

Therefore, I decided to leave that idea for later and first implement
migration of the inflight region for vhost-user-blk only.
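
To illustrate the idea only (this is not the VMSD wire format used by the
patches), here is a minimal Python sketch of carrying a variable-size
inflight buffer in the migration stream as a 64-bit size followed by the raw
bytes. The function names and the big-endian framing are assumptions made
for the example.

```python
import struct


def save_inflight(region: bytes) -> bytes:
    """Source side: emit a big-endian 64-bit size, then the raw region bytes."""
    return struct.pack(">Q", len(region)) + region


def load_inflight(stream: bytes) -> bytes:
    """Destination side: read the size, then exactly that many bytes."""
    (size,) = struct.unpack_from(">Q", stream, 0)
    payload = stream[8:8 + size]
    if len(payload) != size:
        raise ValueError("truncated inflight region")
    return payload


# Round trip: the destination gets back the same region, so the backend
# there could resubmit the requests recorded in it.
region = bytes([1, 2, 3, 4])
restored = load_inflight(save_inflight(region))
```

The size is sent explicitly because the inflight region length is only known
at runtime, which is presumably why a 64-bit variable-size buffer field is
needed.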

Alexandr Moshkov (3):
  vmstate: introduce VMSTATE_VBUFFER_UINT64
  vhost: add vmstate for inflight region with inner buffer
  vhost-user-blk: support inter-host inflight migration

 hw/block/vhost-user-blk.c          | 29 +++++++++++++++++++++
 hw/virtio/vhost.c                  | 42 ++++++++++++++++++++++++++++++
 include/hw/virtio/vhost-user-blk.h |  1 +
 include/hw/virtio/vhost.h          |  6 +++++
 include/migration/vmstate.h        | 10 +++++++
 5 files changed, 88 insertions(+)

-- 
2.34.1

Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Vladimir Sementsov-Ogievskiy 2 months, 3 weeks ago

Add Daniel

On 10.11.25 13:39, Alexandr Moshkov wrote:
> v3:
> - use pre_load_errp instead of pre_load in vhost.c
> - change vhost-user-blk property to
>    "skip-get-vring-base-inflight-migration"
> - refactor vhost-user-blk.c, by moving vhost_user_blk_inflight_needed() higher
> 
> v2:
> - rewrite migration using VMSD instead of qemufile API
> - add vhost-user-blk parameter instead of migration capability
> 
> I don't know if VMSD was used cleanly in migration implementation, so
> feel free for comments.
> 
> Based on Vladimir's work:
> [PATCH v2 00/25] vhost-user-blk: live-backend local migration
>    which was based on:
>      - [PATCH v4 0/7] chardev: postpone connect
>        (which in turn is based on [PATCH 0/2] remove deprecated 'reconnect' options)
>      - [PATCH v3 00/23] vhost refactoring and fixes
>      - [PATCH v8 14/19] migration: introduce .pre_incoming() vmsd handler
> 

Hi!

On my series about backend-transfer migration, the final consensus (or at least,
I hope that it's a consensus :) is that using device properties to control migration
channel content is wrong, and that we should use migration parameters instead.

(discussion here: https://lore.kernel.org/qemu-devel/29aa1d66-9fa7-4e44-b0e3-2ca26e77accf@yandex-team.ru/ )

So the API for backend-transfer features is a migration parameter

     backend-transfer = [ list of QOM paths of devices for which we want to enable backend-transfer ]

and the user doesn't have to change device properties at runtime to set up the following migration.

So I assume a similar practice should be applied here: don't use device
properties to control migration.

So, should it be a parameter like

     migrate-inflight-region = [ list of QOM paths of vhost-user devices ]

?




-- 
Best regards,
Vladimir

Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Peter Xu 2 months, 3 weeks ago

On Tue, Nov 18, 2025 at 11:24:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Add Daniel
> 
> Hi!
> 
> On my series about backend-transfer migration, the final consensus (or at least,
> I hope that it's a consensus:) is that using device properties to control migration
> channel content is wrong. And we should instead use migration parameters.
> 
> (discussion here: https://lore.kernel.org/qemu-devel/29aa1d66-9fa7-4e44-b0e3-2ca26e77accf@yandex-team.ru/ )
> 
> So the API for backend-transfer features is a migration parameter
> 
>     backend-transfer = [ list of QOM paths of devices, for which we want to enable backend-transfer ]
> 
> and user don't have to change device properties in runtime to setup the following migration.
> 
> So I assume, similar practice should be applied here: don't use device
> properties to control migration.
> 
> So, should it be a parameter like
> 
>     migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
> 
> ?

I have a concern that if we start doing this more, migration qapi/ will be
completely messed up.

Imagine a world where there'll be tons of lists like:

  migrate-dev1-some-feature-1 = [list of devices (almost only dev1 typed)]
  migrate-dev2-some-feature-2 = [list of devices (almost only dev2 typed)]
  migrate-dev3-some-feature-3 = [list of devices (almost only dev3 typed)]
  ...

That doesn't look reasonable at all.  If some feature is likely only
supported in one device, that should not appear in migration.json but only
in the specific device.

I'm not fully convinced that we can't enable some form of machine type
properties (with QDEV or not) on backends; if we can, we should stick with
something like that.  I can have a closer look this week, but.. even if not, I
still think migration shouldn't care about some specific behavior of a
specific device.

If we really want to have some way to probe device features, maybe we
should also think about a generic interface (rather than "one new list
every time").  We also have some recent discussions on a proper interface
to query TAP backend features like USO*.  Maybe they share some of the
goals here.

Thanks,

-- 
Peter Xu

Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Vladimir Sementsov-Ogievskiy 2 months, 1 week ago

On 19.11.25 01:05, Peter Xu wrote:
> On Tue, Nov 18, 2025 at 11:24:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Add Daniel
>>
>> Hi!
>>
>> On my series about backend-transfer migration, the final consensus (or at least,
>> I hope that it's a consensus:) is that using device properties to control migration
>> channel content is wrong. And we should instead use migration parameters.
>>
>> (discussion here: https://lore.kernel.org/qemu-devel/29aa1d66-9fa7-4e44-b0e3-2ca26e77accf@yandex-team.ru/ )
>>
>> So the API for backend-transfer features is a migration parameter
>>
>>      backend-transfer = [ list of QOM paths of devices, for which we want to enable backend-transfer ]
>>
>> and user don't have to change device properties in runtime to setup the following migration.
>>
>> So I assume, similar practice should be applied here: don't use device
>> properties to control migration.
>>
>> So, should it be a parameter like
>>
>>      migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
>>
>> ?
> 
> I have concern that if we start doing this more, migration qapi/ will be
> completely messed up.
> 
> Imagine a world where there'll be tons of lists like:
> 
>    migrate-dev1-some-feature-1 = [list of devices (almost only dev1 typed)]
>    migrate-dev2-some-feature-2 = [list of devices (almost only dev2 typed)]
>    migrate-dev3-some-feature-3 = [list of devices (almost only dev3 typed)]
>    ...


Yes, hard to argue against it.

I still hope Daniel will share his opinion..

From our side, we are OK with any interface that gets accepted)


Let me summarize in short the variants I see:

===

1. lists

Add migration parameters for such features:

migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
backend-transfer = [ list of QOM paths of devices whose backend should be migrated ]

This way, we just need to set the same sets for source and target QEMU before migration,
and it has no relation to machine types.

PROS: Like any other migration capability, it sets up what (and how) should migrate, with no
relation to device properties and MT.

CONS: Logically, that's the same as adding a device property, but instead we implement
lists of devices, and create extra QOM_PATH links.
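
As a rough illustration of "set the same sets for source and target", here
is a small Python sketch of the check a management layer might run before
starting migration. The parameter names are the ones proposed above (they do
not exist in QEMU today), and the helper name and QOM paths are made up.

```python
# Proposed list-valued migration parameters from variant 1 (hypothetical).
FEATURE_LISTS = ("migrate-inflight-region", "backend-transfer")


def mismatched_lists(src_params: dict, dst_params: dict) -> list:
    """Return the names of list parameters whose device sets differ."""
    return [
        name
        for name in FEATURE_LISTS
        if set(src_params.get(name, [])) != set(dst_params.get(name, []))
    ]


src = {
    "migrate-inflight-region": ["/machine/peripheral/blk0"],
    "backend-transfer": [],
}
dst = {
    "migrate-inflight-region": ["/machine/peripheral/blk0"],
    "backend-transfer": ["/machine/peripheral/net0"],
}

# Only backend-transfer differs between the two sides here.
problems = mismatched_lists(src, dst)
```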

===

2. device parameters

Before migration we should loop through devices and call corresponding
qom-set commands, like

qom-set {path: QOM_PATH, property: "backend-transfer", "value": true}
qom-set {path: QOM_PATH, property: "migrate-inflight-region", "value": true}

And of course, we should take care to set the same values for corresponding devices on source
and target.

In this case, we also _can_ rely on machine types for defaults.

Note that "migrate-inflight-region" may become the default in the 11.0 MT.
But backend-transfer can't be a default, as this way we'll break remote migration.

PROS: No lists, native properties

CONS: These properties do not define any portion of device state, internal or
visible to the guest. It's not a property of the device, but an option for migration
of that device.
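
For illustration, the per-device loop could look roughly like this on the
management side. This is only a sketch: qom-set is an existing QMP command
taking path, property and value, but the property name is the one proposed
in this thread, and the device paths and helper name are made up.

```python
import json


def qom_set_commands(qom_paths, prop, value):
    """Build one qom-set QMP command per device path."""
    return [
        {"execute": "qom-set",
         "arguments": {"path": path, "property": prop, "value": value}}
        for path in qom_paths
    ]


# Hypothetical device paths; the property is the one proposed above,
# not an existing QEMU property.
paths = ["/machine/peripheral/blk0", "/machine/peripheral/blk1"]
commands = qom_set_commands(paths, "migrate-inflight-region", True)

# The same serialized commands would be sent to both source and target QEMU.
wire = [json.dumps(cmd) for cmd in commands]
```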

===

2.1 = [2] assisted by one boolean migration-parameter

Still, if we want to make backend-transfer "a kind of" default, we'll need one boolean
migration parameter "it-is-local-migration", and modify the logic to

really_do_backend_transfer = it-is-local-migration and device.backend-transfer
really_do_migrate_inflight_region = not it-is-local-migration and device.migrate-inflight-region

PROS: starting from some MT, we'll have good defaults, so that the user doesn't have
to enable/disable the option per device for every migration.

CONS: a precedent of behavior driven by a combination of a device property and
a corresponding migration parameter (or do we have something similar?)
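
The combined logic of this variant, written out as a small Python sketch.
The class and function names are hypothetical; the two device flags and the
single boolean migration parameter mirror the names used above.

```python
from dataclasses import dataclass


@dataclass
class DeviceOpts:
    """Hypothetical per-device flags, named after the proposed properties."""
    backend_transfer: bool = False
    migrate_inflight_region: bool = False


def effective_features(is_local_migration: bool, dev: DeviceOpts) -> dict:
    """Combine the one migration parameter with the per-device flags."""
    return {
        "backend_transfer": is_local_migration and dev.backend_transfer,
        "migrate_inflight_region": (not is_local_migration)
        and dev.migrate_inflight_region,
    }


# With both flags enabled by default, the migration parameter alone picks
# which of the two mechanisms is actually used.
dev = DeviceOpts(backend_transfer=True, migrate_inflight_region=True)
local = effective_features(True, dev)
remote = effective_features(False, dev)
```

With these inputs, a local migration uses backend-transfer only and a remote
one migrates the inflight region only, so the two are never active together.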

===

4. mixed

Keep [2] for this series, and [1] for backend-transfer.

PROS: the list for backend-transfer remains "the only exception" instead of "the practice",
so we will not have tons of such lists :)

CONS: inconsistent solutions for similar things

===

5. implement "per device" migration parameters

They may be set by an additional QMP command qmp-migrate-set-device-parameters, which
will take an additional qom-path parameter.

Or, we may add one list of structures like

[{
    qom_path: ...
    parameters: ..
}, ...]

into common migration parameters.

PROS: keep new features as a property of migration, but avoid several lists of QOM paths
CONS: ?

Hmm, we could also select devices not only by qom_path but by type, for example to enable
a feature for all virtio-net devices. Hmm, and this type may also be used as a discriminator
for parameters, which may be a QAPI union type..
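
A rough Python sketch of how such a list of structures could be consulted,
including selection by type. Everything here is hypothetical: the entry
layout, the "type" key, the paths, the type names, and the rule that an
exact path match wins over a type match.

```python
from typing import Optional

# Hypothetical per-device migration parameters in the shape sketched above.
device_parameters = [
    {"qom_path": "/machine/peripheral/blk0",
     "parameters": {"migrate-inflight-region": True}},
    {"type": "virtio-net-pci",
     "parameters": {"backend-transfer": True}},
]


def lookup(qom_path: str, dev_type: str, name: str) -> Optional[bool]:
    """Return a parameter for a device, or None if no entry sets it."""
    by_type = None
    for entry in device_parameters:
        if name not in entry["parameters"]:
            continue
        if entry.get("qom_path") == qom_path:
            # An exact path match takes precedence over any type match.
            return entry["parameters"][name]
        if entry.get("type") == dev_type and by_type is None:
            by_type = entry["parameters"][name]
    return by_type
```

For example, lookup("/machine/peripheral/net0", "virtio-net-pci",
"backend-transfer") is resolved through the type entry, while a parameter
that no entry mentions comes back as None.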

===


Thoughts?

> 
> That doesn't look reasonable at all.  If some feature is likely only
> supported in one device, that should not appear in migration.json but only
> in the specific device.
> 
> I don't think I'm fully convinced we can't enable some form of machine type
> properties (with QDEV or not) on backends we should stick with something
> like that.  I can have some closer look this week, but.. even if not, I
> still think migration shouldn't care about some specific behavior of a
> specific device.
> 
> If we really want to have some way to probe device features, maybe we
> should also think about a generic interface (rather than "one new list
> every time").  We also have some recent discussions on a proper interface
> to query TAP backend features like USO*.  Maybe they share some of the
> goals here.
> 
What do you mean by probing device features? Isn't that the qom-get command?

-- 
Best regards,
Vladimir

Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Peter Xu 2 months ago

On Thu, Dec 04, 2025 at 10:55:33PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 19.11.25 01:05, Peter Xu wrote:
> > On Tue, Nov 18, 2025 at 11:24:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > Add Daniel
> > > 
> > > Hi!
> > > 
> > > On my series about backend-transfer migration, the final consensus (or at least,
> > > I hope that it's a consensus:) is that using device properties to control migration
> > > channel content is wrong. And we should instead use migration parameters.
> > > 
> > > (discussion here: https://lore.kernel.org/qemu-devel/29aa1d66-9fa7-4e44-b0e3-2ca26e77accf@yandex-team.ru/ )
> > > 
> > > So the API for backend-transfer features is a migration parameter
> > > 
> > >      backend-transfer = [ list of QOM paths of devices, for which we want to enable backend-transfer ]
> > > 
> > > and user don't have to change device properties in runtime to setup the following migration.
> > > 
> > > So I assume, similar practice should be applied here: don't use device
> > > properties to control migration.
> > > 
> > > So, should it be a parameter like
> > > 
> > >      migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
> > > 
> > > ?
> > 
> > I have concern that if we start doing this more, migration qapi/ will be
> > completely messed up.
> > 
> > Imagine a world where there'll be tons of lists like:
> > 
> >    migrate-dev1-some-feature-1 = [list of devices (almost only dev1 typed)]
> >    migrate-dev2-some-feature-2 = [list of devices (almost only dev2 typed)]
> >    migrate-dev3-some-feature-3 = [list of devices (almost only dev3 typed)]
> >    ...
> 
> 
> Yes, hard to argue against it.
> 
> I still hope, Daniel will share his opinion..
> 
> From our side, we are OK with any interface, which is accepted)
> 
> 
> Let me summarize in short the variants I see:
> 
> ===
> 
> 1. lists
> 
> Add migrations parameters for such features:
> 
> migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
> backend-transfer = [ list of QOM paths of devices, which backend should be migrated ]
> 
> This way, we just need to set the same sets for source and target QEMU before migration,
> and it have no relation to machine types.
> 
> PROS: Like any other migration-capability, setup what (and how) should migrate, no
> relation to device properties and MT.
> 
> CONS: Logically, that's the same as add a device property, but instead we implement
> lists of devices, and create extra QOM_PATH-links.
> 
> ===
> 
> 2. device parameters
> 
> Before migration we should loop through devices and call corresponding
> qom-set commands, like
> 
> qom-set {path: QOM_PATH, property: "backend-transfer", "value": true}
> qom-set {path: QOM_PATH, property: "migrate-inflight-region", "value": true}
> 
> And of course, we should care to set same values for corresponding devices on source
> and target.
> 
> In this case, we also _can_ rely on machine types for defaults.
> 
> Note, that "migrate-inflight-region" may become the default in the 11.0 MT.
> But backend-transfer can't be a default, as this way we'll break remote migration.
> 
> PROS: No lists, native properties
> 
> CONS: These properties does not define any portion of device state, internal or
> visible to guest. It's not a property of device, but it's and option for migration
> of that device.
> 
> ===
> 
> 2.1 = [2] assisted by one boolean migration-parameter
> 
> Still, if we want make backend-transfer "a kind of" default, we'll need one boolean
> migration parameter "it-is-local-migration", and modify logic to
> 
> really_do_backend_transfer = it-is-local-migration and device.backend-transfer
> really_do_migrate_inflight_region = not it-is-local-migration and device.migrate-inflight-region
> 
> PROS: starting from some MT, we'll have good defaults, so that user don't have
> to enable/disable the option per device for every migration.
> 
> CONS: a precedent of the behavior driven by combination of device property and
> corresponding migration parameter (or we have something similar?)
> 
> ===
> 
> 4. mixed
> 
> Keep [2] for this series, and [1] for backend-transfer.
> 
> PROS: list for backend-transfer remains "the only exclusion" instead of "the practice",
> so we will not have tons of such lists :)
> 
> CONS: inconstant solutions for similar things
> 
> ===
> 
> 5. implement "per device" migration parameters
> 
> They may be set by additional QMP command qmp-migrate-set-device-parameters, which
> will take additional qom-path parameter.
> 
> Or, we may add one list of structures like
> 
> [{
>    qom_path: ...
>    parameters: ..
> }, ...]
> 
> into common migration parameters.
> 
> PROS: keep new features as a property of migration, but avoid several lists of QOM paths
> CONS: ?
> 
> Hmm, we also may select devices not only by qom_path, but by type, for example, to enable
> feature for all virtio-net devices. Hmm, and this type may be also used as discriminator
> for parameters, which may be a QAPI union type..
> 
> ===
> 
> 
> Thoughts?

Sorry for the late response, other things kept interrupting me when I
wanted to look at this..

I just sent a series here, allowing any kind of TYPE_OBJECT to
work with machine compat properties:

https://lore.kernel.org/r/20251209162857.857593-1-peterx@redhat.com

I still want to see if we can stick with compat properties in general
whenever it's about defining guest ABI.

What you proposed should work, but it'll involve a second way of probing
"what is the guest ABI": providing a new QMP query command, having mgmt query
both QEMUs, and then setting the common subset on both.  It would be
finer-grained, but as I discussed previously, I think it's reinventing the
wheel, and it may bloat mgmt by making it care about too many trivial
per-device details.

Please have a look to see the feasibility.  As mentioned in the cover
letter, that will need further work to e.g. QOMify TAP first, at least for
your series.  But I don't yet see it as a blocker?  Once QOMified, it can
directly inherit OBJECT_COMPAT, and then TAP can add compat properties.

I wonder if vhost-user-blk can already use compat properties.

Thanks,


-- 
Peter Xu

Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Vladimir Sementsov-Ogievskiy 2 months ago

On 09.12.25 19:51, Peter Xu wrote:
> On Thu, Dec 04, 2025 at 10:55:33PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 19.11.25 01:05, Peter Xu wrote:
>>> On Tue, Nov 18, 2025 at 11:24:12PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>> Add Daniel
>>>>
>>>> Hi!
>>>>
>>>> On my series about backend-transfer migration, the final consensus (or at least,
>>>> I hope that it's a consensus:) is that using device properties to control migration
>>>> channel content is wrong. And we should instead use migration parameters.
>>>>
>>>> (discussion here: https://lore.kernel.org/qemu-devel/29aa1d66-9fa7-4e44-b0e3-2ca26e77accf@yandex-team.ru/ )
>>>>
>>>> So the API for backend-transfer features is a migration parameter
>>>>
>>>>       backend-transfer = [ list of QOM paths of devices, for which we want to enable backend-transfer ]
>>>>
>>>> and user don't have to change device properties in runtime to setup the following migration.
>>>>
>>>> So I assume, similar practice should be applied here: don't use device
>>>> properties to control migration.
>>>>
>>>> So, should it be a parameter like
>>>>
>>>>       migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
>>>>
>>>> ?
>>>
>>> I have concern that if we start doing this more, migration qapi/ will be
>>> completely messed up.
>>>
>>> Imagine a world where there'll be tons of lists like:
>>>
>>>     migrate-dev1-some-feature-1 = [list of devices (almost only dev1 typed)]
>>>     migrate-dev2-some-feature-2 = [list of devices (almost only dev2 typed)]
>>>     migrate-dev3-some-feature-3 = [list of devices (almost only dev3 typed)]
>>>     ...
>>
>>
>> Yes, hard to argue against it.
>>
>> I still hope, Daniel will share his opinion..
>>
>>  From our side, we are OK with any interface, which is accepted)
>>
>>
>> Let me summarize in short the variants I see:
>>
>> ===
>>
>> 1. lists
>>
>> Add migrations parameters for such features:
>>
>> migrate-inflight-region = [ list of QOM paths of vhost-user devices ]
>> backend-transfer = [ list of QOM paths of devices, which backend should be migrated ]
>>
>> This way, we just need to set the same sets for source and target QEMU before migration,
>> and it have no relation to machine types.
>>
>> PROS: Like any other migration-capability, setup what (and how) should migrate, no
>> relation to device properties and MT.
>>
>> CONS: Logically, that's the same as add a device property, but instead we implement
>> lists of devices, and create extra QOM_PATH-links.
>>
>> ===
>>
>> 2. device parameters
>>
>> Before migration we should loop through devices and call corresponding
>> qom-set commands, like
>>
>> qom-set {path: QOM_PATH, property: "backend-transfer", "value": true}
>> qom-set {path: QOM_PATH, property: "migrate-inflight-region", "value": true}
>>
>> And of course, we should care to set same values for corresponding devices on source
>> and target.
>>
>> In this case, we also _can_ rely on machine types for defaults.
>>
>> Note, that "migrate-inflight-region" may become the default in the 11.0 MT.
>> But backend-transfer can't be a default, as this way we'll break remote migration.
>>
>> PROS: No lists, native properties
>>
>> CONS: These properties does not define any portion of device state, internal or
>> visible to guest. It's not a property of device, but it's and option for migration
>> of that device.
>>
>> ===
>>
>> 2.1 = [2] assisted by one boolean migration-parameter
>>
>> Still, if we want make backend-transfer "a kind of" default, we'll need one boolean
>> migration parameter "it-is-local-migration", and modify logic to
>>
>> really_do_backend_transfer = it-is-local-migration and device.backend-transfer
>> really_do_migrate_inflight_region = not it-is-local-migration and device.migrate-inflight-region
>>
>> PROS: starting from some MT, we'll have good defaults, so that user don't have
>> to enable/disable the option per device for every migration.
>>
>> CONS: a precedent of the behavior driven by combination of device property and
>> corresponding migration parameter (or we have something similar?)
>>
>> ===
>>
>> 4. mixed
>>
>> Keep [2] for this series, and [1] for backend-transfer.
>>
>> PROS: list for backend-transfer remains "the only exclusion" instead of "the practice",
>> so we will not have tons of such lists :)
>>
>> CONS: inconstant solutions for similar things
>>
>> ===
>>
>> 5. implement "per device" migration parameters
>>
>> They may be set by additional QMP command qmp-migrate-set-device-parameters, which
>> will take additional qom-path parameter.
>>
>> Or, we may add one list of structures like
>>
>> [{
>>     qom_path: ...
>>     parameters: ..
>> }, ...]
>>
>> into common migration parameters.
>>
>> PROS: keep new features as a property of migration, but avoid several lists of QOM paths
>> CONS: ?
>>
>> Hmm, we also may select devices not only by qom_path, but by type, for example, to enable
>> feature for all virtio-net devices. Hmm, and this type may be also used as discriminator
>> for parameters, which may be a QAPI union type..
>>
>> ===
>>
>>
>> Thoughts?
> 
> Sorry to respond late, I kept getting other things interrupting me when I
> wanted to look at this..
> 
> I just sent a series here, allowing TYPE_OBJECT of any kind to be able to
> work with machine compat properties:
> 
> https://lore.kernel.org/r/20251209162857.857593-1-peterx@redhat.com
> 
> I still want to see if we can stick with compat properties in general
> whenever it's about defining guest ABI.
> 
> What you proposed should work, but that'll involve a 2nd way of probing
> "what is the guest ABI" by providing a new QMP query command and then set
> them after mgmt queries both QEMUs then set the subset of both.  It will be
> finer granule but as I discussed previously, I think it's re-inventing the
> wheels, and it may cause mgmt over-bloated on caring too many trivial
> details of per-device specific details.
> 
> Please have a look to see the feasibility.  As mentioned in the cover
> letter, that will need further work to e.g. QOMify TAP first at least for
> your series.  But I don't yet see it as a blocker?  After QOMified, it can
> inherit directly the OBJECT_COMPAT then TAP can add compat properties.
> 
> I wonder if vhost-usr-blk can already use compat properties.
> 

Yes, it can. And regardless of the way we choose, qdev properties or qapi,
I don't think we need a property for the backend itself. We need a property
(or migration capability) for vhost-user-blk itself, saying that its
backend should be migrated.

It's a lot simpler to migrate the backend inside the frontend state. If we
migrate the backend separately, we can't control the order of backend/frontend
states, and will have to implement some late point in the state load process,
where both are already loaded and we can do our post-load logic.



-- 
Best regards,
Vladimir

Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Peter Xu 2 months ago

On Wed, Dec 10, 2025 at 02:41:20PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Yes, it can. And regardless of the way we chose: qdev properties or qapi,
> I don't think we need a property for backend itself. We need a property
> (or migration capability) for vhost-user-blk itself, saying that its
> backend should be migrated.

The problem is that we'd then need to introduce the new property in all frontends
that would support the backend?  If it's a backend property, it can be one
property for the backend that all the frontends can consume.

> 
> It's a lot simpler to migrate backend inside of frontend state. If we
> migrate backend in separate, we can't control the order of backend/frontend
> stats, and will have to implement some late point in state load process,
> where both are already loaded and we can do our post-load logic.

Would MigrationPriority help when defining the VMSD?

Thanks,

-- 
Peter Xu

Re: [PATCH v3 0/3] vhost-user-blk: support inflight migration

Posted by Vladimir Sementsov-Ogievskiy 2 months ago

On 10.12.25 18:20, Peter Xu wrote:
> On Wed, Dec 10, 2025 at 02:41:20PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Yes, it can. And regardless of the way we chose: qdev properties or qapi,
>> I don't think we need a property for backend itself. We need a property
>> (or migration capability) for vhost-user-blk itself, saying that its
>> backend should be migrated.
> 
> The problem is then we need to introduce the new property to all frontends
> that would support the backend?  If it's a backend property, it can be one
> property for the backend that all the frontends can consume.
> 

Hmm, agreed, that's right.. So we may not need to touch the frontend at all, and only
set up the backend to be migrated. And this remains transparent for the frontend side.

>>
>> It's a lot simpler to migrate backend inside of frontend state. If we
>> migrate backend in separate, we can't control the order of backend/frontend
>> stats, and will have to implement some late point in state load process,
>> where both are already loaded and we can do our post-load logic.
> 
> Would MigrationPriority help when defining the VMSD?
> 

I didn't know about it. Most probably it will help: we just set things up so that backends
migrate before frontends.
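
A toy Python model of what that ordering would mean, assuming entries with a
higher priority are processed first and entries with equal priority keep
their registration order. The class, entry names and priority values are
made up; this is not QEMU code.

```python
from dataclasses import dataclass


@dataclass
class StateEntry:
    """Hypothetical stand-in for a registered migration state section."""
    name: str
    priority: int = 0


def load_order(entries):
    """Higher priority first; sorted() is stable, so ties keep their order."""
    return sorted(entries, key=lambda e: e.priority, reverse=True)


registered = [
    StateEntry("vhost-user-blk frontend"),        # default priority
    StateEntry("backend", priority=1),            # raised so it goes first
    StateEntry("some other device"),              # default priority
]
order = [e.name for e in load_order(registered)]
```

In this model the backend state is handled before both default-priority
entries, so the frontend's post-load logic could rely on it being present.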


-- 
Best regards,
Vladimir