[PATCH 0/2] vhost-user-blk: support inflight migration

Alexandr Moshkov posted 2 patches 3 weeks, 4 days ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20251020054413.2614932-1-dtalexundeer@yandex-team.ru
Maintainers: Raphael Norwitz <raphael@enfabrica.net>, "Michael S. Tsirkin" <mst@redhat.com>, Stefano Garzarella <sgarzare@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>
There is a newer version of this series
hw/block/vhost-user-blk.c | 52 ++++++++++++++++++++++++++++++++++++
hw/virtio/vhost.c         | 56 +++++++++++++++++++++++++++++++++++++++
include/hw/virtio/vhost.h |  2 ++
migration/options.c       |  7 +++++
migration/options.h       |  1 +
qapi/migration.json       |  9 +++++--
6 files changed, 125 insertions(+), 2 deletions(-)
[PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Alexandr Moshkov 3 weeks, 4 days ago
Hi!

During inter-host migration, waiting for disk requests to be drained
in the vhost-user backend can incur significant downtime.

This can be avoided if QEMU migrates the inflight region in vhost-user-blk. 
Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
then, after migration, they will be executed on another host. 

At first, I tried to implement migration for all vhost-user devices that support inflight at once, 
but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and 
in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).

Therefore, for now I decided to leave this idea for later and 
implement the migration of the inflight region first for vhost-user-blk.

Alexandr Moshkov (2):
  vhost: support inflight save/load
  vhost-user-blk: support inflight migration

 hw/block/vhost-user-blk.c | 52 ++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost.c         | 56 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/vhost.h |  2 ++
 migration/options.c       |  7 +++++
 migration/options.h       |  1 +
 qapi/migration.json       |  9 +++++--
 6 files changed, 125 insertions(+), 2 deletions(-)

-- 
2.34.1
Re: [PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Raphael Norwitz 3 weeks, 2 days ago
The logic looks ok from the vhost-user-blk side but some comments inline.

On Mon, Oct 20, 2025 at 1:47 AM Alexandr Moshkov
<dtalexundeer@yandex-team.ru> wrote:
>
> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
> then, after migration, they will be executed on another host.
>
> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>

Even if it's a more significant change I'd rather generalize as much
logic as possible and expose it as a vhost-user protocol feature. IMO
too much vhost-user device-agnositic code is being pushed into
vhost-user-blk.

As Markus noted this also conflicts significantly with Vladimir's
series so I'd suggest waiting until those are in, or possibly
attempting to generalize on top of his changes.

> Therefore, for now I decided to leave this idea for later and
> implement the migration of the inflight region first for vhost-user-blk.
>
> Alexandr Moshkov (2):
>   vhost: support inflight save/load
>   vhost-user-blk: support inflight migration
>
>  hw/block/vhost-user-blk.c | 52 ++++++++++++++++++++++++++++++++++++
>  hw/virtio/vhost.c         | 56 +++++++++++++++++++++++++++++++++++++++
>  include/hw/virtio/vhost.h |  2 ++
>  migration/options.c       |  7 +++++
>  migration/options.h       |  1 +
>  qapi/migration.json       |  9 +++++--
>  6 files changed, 125 insertions(+), 2 deletions(-)
>
> --
> 2.34.1
>
>
Re: [PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Alexandr Moshkov 3 weeks, 2 days ago
Hi!

On 10/21/25 23:54, Raphael Norwitz wrote:

> The logic looks ok from the vhost-user-blk side but some comments inline.
>
> On Mon, Oct 20, 2025 at 1:47 AM Alexandr Moshkov
> <dtalexundeer@yandex-team.ru> wrote:
>> Hi!
>>
>> During inter-host migration, waiting for disk requests to be drained
>> in the vhost-user backend can incur significant downtime.
>>
>> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
>> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
>> then, after migration, they will be executed on another host.
>>
>> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
>> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
>> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>>
> Even if it's a more significant change I'd rather generalize as much
> logic as possible and expose it as a vhost-user protocol feature. IMO
> too much vhost-user device-agnositic code is being pushed into
> vhost-user-blk.

As far as I understand (correct me if I'm wrong), but this feature must 
be implemented in device itself (along with inflight field location) or 
in some base class for vhost-user devices. For now vhost-user-blkdoesn't 
have one yet.The closest base class that i could find is 
vhost-user-base, but for using it, it has to implement all 
device-agnostic code from vhost-user-blk, and don't break all existing 
vhost-user-base derived devices. For example, to support inflight 
migration in base class, there first need to implement inflight field in 
it, along with the reconnect feature. Then, with a big refactor, somehow 
inherit vhost-user-blk from it, along with other devices such as 
vhost-user-fs.

So, IMO it looks like a whole separate track that will need to be tackled.

> As Markus noted this also conflicts significantly with Vladimir's
> series so I'd suggest waiting until those are in, or possibly
> attempting to generalize on top of his changes.

Yea, but i think, i just need to perform inflight migration only when 
some non-local migration flag is set.

And i think, Vladimir's series already have that flag.


Thanks for the comments!
Re: [PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Raphael Norwitz 3 weeks, 1 day ago
On Wed, Oct 22, 2025 at 3:59 AM Alexandr Moshkov
<dtalexundeer@yandex-team.ru> wrote:
>
> Hi!
>
> On 10/21/25 23:54, Raphael Norwitz wrote:
>
> The logic looks ok from the vhost-user-blk side but some comments inline.
>
> On Mon, Oct 20, 2025 at 1:47 AM Alexandr Moshkov
> <dtalexundeer@yandex-team.ru> wrote:
>
> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
> then, after migration, they will be executed on another host.
>
> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>
> Even if it's a more significant change I'd rather generalize as much
> logic as possible and expose it as a vhost-user protocol feature. IMO
> too much vhost-user device-agnositic code is being pushed into
> vhost-user-blk.
>
> As far as I understand (correct me if I'm wrong), but this feature must be implemented in device itself (along with inflight field location) or in some base class for vhost-user devices. For now vhost-user-blk doesn't have one yet. The closest base class that i could find is vhost-user-base, but for using it, it has to implement all device-agnostic code from vhost-user-blk, and don't break all existing vhost-user-base derived devices. For example, to support inflight migration in base class, there first need to implement inflight field in it, along with the reconnect feature. Then, with a big refactor, somehow inherit vhost-user-blk from it, along with other devices such as vhost-user-fs.
>
> So, IMO it looks like a whole separate track that will need to be tackled.
>

I get that inflight is inside of struct VhostUserBlk, not in struct
vhost_user. Even if not all device types support inflight I think that
could probably be changed? If we do keep things as you have it here
patch 1 should probably bring the vhost_dev_load_inflight() and
vhost_dev_save_inflight() into vhost_user.c rather than vhost.c since
only vhost_user devices seem to use inflight.

> As Markus noted this also conflicts significantly with Vladimir's
> series so I'd suggest waiting until those are in, or possibly
> attempting to generalize on top of his changes.
>
> Yea, but i think, i just need to perform inflight migration only when some non-local migration flag is set.
>

I'm not sure about that. While it may be completely safe in your case
to avoid GET_VRING_BASE/quiescing the backend, it could introduce
corruption cases on other vhost-user-blk storage backends which work
differently. I think this would require an "opt-in" migration
parameter over and above checking that it's not a local migration.

Ideally there would be some way to have the vhost-user backends expose
support per device with a protocol feature and, if the source and
destination both advertise support, we perform the inflight migration
skipping GET_VRING_BASE. I don't know enough about the migration code
to know how that would be done or if it is practical.

Also (and I think this also applies to Vladimir's series), it should
be possible to add multiple vhost-user-blk devices to the same Qemu
instance but have the devices backed by very different backend
implementations. What happens if some of those backend implementations
support things like lightweight local migration and/or inflight
migration but others don't? Ideally you would want the API to be able
to select target vhost-user-blk devices for accelerated migration
rather than assuming they will all want to use the same features. I
don't know if anyone is doing that or if we care about such cases but
I just want to throw the concern out there.

> And i think, Vladimir's series already have that flag.
>
>
> Thanks for the comments!
Re: [PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Alexandr Moshkov 3 weeks, 1 day ago
On 10/23/25 02:33, Raphael Norwitz wrote:
> On Wed, Oct 22, 2025 at 3:59 AM Alexandr Moshkov
> <dtalexundeer@yandex-team.ru> wrote:
>> Hi!
>>
>> On 10/21/25 23:54, Raphael Norwitz wrote:
>>
>> The logic looks ok from the vhost-user-blk side but some comments inline.
>>
>> On Mon, Oct 20, 2025 at 1:47 AM Alexandr Moshkov
>> <dtalexundeer@yandex-team.ru> wrote:
>>
>> Hi!
>>
>> During inter-host migration, waiting for disk requests to be drained
>> in the vhost-user backend can incur significant downtime.
>>
>> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
>> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
>> then, after migration, they will be executed on another host.
>>
>> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
>> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
>> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>>
>> Even if it's a more significant change I'd rather generalize as much
>> logic as possible and expose it as a vhost-user protocol feature. IMO
>> too much vhost-user device-agnositic code is being pushed into
>> vhost-user-blk.
>>
>> As far as I understand (correct me if I'm wrong), but this feature must be implemented in device itself (along with inflight field location) or in some base class for vhost-user devices. For now vhost-user-blk doesn't have one yet. The closest base class that i could find is vhost-user-base, but for using it, it has to implement all device-agnostic code from vhost-user-blk, and don't break all existing vhost-user-base derived devices. For example, to support inflight migration in base class, there first need to implement inflight field in it, along with the reconnect feature. Then, with a big refactor, somehow inherit vhost-user-blk from it, along with other devices such as vhost-user-fs.
>>
>> So, IMO it looks like a whole separate track that will need to be tackled.
>>
> I get that inflight is inside of struct VhostUserBlk, not in struct
> vhost_user. Even if not all device types support inflight I think that
> could probably be changed? If we do keep things as you have it here
> patch 1 should probably bring the vhost_dev_load_inflight() and
> vhost_dev_save_inflight() into vhost_user.c rather than vhost.c since
> only vhost_user devices seem to use inflight.

Firstly, I tried to implement inflight in the vhost_user structure, but 
this structure is initialized on every vhost_user_blk_connect() and then 
destroyed on every vhost_user_blk_disconnect(). But, the inflight region 
must survive device disconnect.

During active migration, the vhost_user struct will be destroyed before 
the inflight saving.

And you are right about the location of the functions in patch 1. I will 
move them to the vhost-user.c file, which will be more appropriate. Thanks!

>> As Markus noted this also conflicts significantly with Vladimir's
>> series so I'd suggest waiting until those are in, or possibly
>> attempting to generalize on top of his changes.
>>
>> Yea, but i think, i just need to perform inflight migration only when some non-local migration flag is set.
>>
> I'm not sure about that. While it may be completely safe in your case
> to avoid GET_VRING_BASE/quiescing the backend, it could introduce
> corruption cases on other vhost-user-blk storage backends which work
> differently. I think this would require an "opt-in" migration
> parameter over and above checking that it's not a local migration.
In patch 2 I introduce migration capability for that. Or you think that 
we need something like per-device parameter?
>
> Ideally there would be some way to have the vhost-user backends expose
> support per device with a protocol feature and, if the source and
> destination both advertise support, we perform the inflight migration
> skipping GET_VRING_BASE. I don't know enough about the migration code
> to know how that would be done or if it is practical.
>
> Also (and I think this also applies to Vladimir's series), it should
> be possible to add multiple vhost-user-blk devices to the same Qemu
> instance but have the devices backed by very different backend
> implementations. What happens if some of those backend implementations
> support things like lightweight local migration and/or inflight
> migration but others don't? Ideally you would want the API to be able
> to select target vhost-user-blk devices for accelerated migration
> rather than assuming they will all want to use the same features. I
> don't know if anyone is doing that or if we care about such cases but
> I just want to throw the concern out there.

Hmm, that looks like we need some per-device parameter in order to set 
this feature on for them.

I guess we need a opinion from migration maintainer about such cases
Re: [PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Raphael Norwitz 3 weeks ago
On Thu, Oct 23, 2025 at 2:44 AM Alexandr Moshkov
<dtalexundeer@yandex-team.ru> wrote:
>
>
> On 10/23/25 02:33, Raphael Norwitz wrote:
>
> On Wed, Oct 22, 2025 at 3:59 AM Alexandr Moshkov
> <dtalexundeer@yandex-team.ru> wrote:
>
> Hi!
>
> On 10/21/25 23:54, Raphael Norwitz wrote:
>
> The logic looks ok from the vhost-user-blk side but some comments inline.
>
> On Mon, Oct 20, 2025 at 1:47 AM Alexandr Moshkov
> <dtalexundeer@yandex-team.ru> wrote:
>
> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
> then, after migration, they will be executed on another host.
>
> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>
> Even if it's a more significant change I'd rather generalize as much
> logic as possible and expose it as a vhost-user protocol feature. IMO
> too much vhost-user device-agnositic code is being pushed into
> vhost-user-blk.
>
> As far as I understand (correct me if I'm wrong), but this feature must be implemented in device itself (along with inflight field location) or in some base class for vhost-user devices. For now vhost-user-blk doesn't have one yet. The closest base class that i could find is vhost-user-base, but for using it, it has to implement all device-agnostic code from vhost-user-blk, and don't break all existing vhost-user-base derived devices. For example, to support inflight migration in base class, there first need to implement inflight field in it, along with the reconnect feature. Then, with a big refactor, somehow inherit vhost-user-blk from it, along with other devices such as vhost-user-fs.
>
> So, IMO it looks like a whole separate track that will need to be tackled.
>
> I get that inflight is inside of struct VhostUserBlk, not in struct
> vhost_user. Even if not all device types support inflight I think that
> could probably be changed? If we do keep things as you have it here
> patch 1 should probably bring the vhost_dev_load_inflight() and
> vhost_dev_save_inflight() into vhost_user.c rather than vhost.c since
> only vhost_user devices seem to use inflight.
>
> Firstly, I tried to implement inflight in the vhost_user structure, but this structure is initialized on every vhost_user_blk_connect() and then destroyed on every vhost_user_blk_disconnect(). But, the inflight region must survive device disconnect.
>
> During active migration, the vhost_user struct will be destroyed before the inflight saving.
>

In struct vhost we should probably add some data which persists
between reconnects. I take your point that we cannot generalize
inflight migration without doing that first.

> And you are right about the location of the functions in patch 1. I will move them to the vhost-user.c file, which will be more appropriate. Thanks!
>
> As Markus noted this also conflicts significantly with Vladimir's
> series so I'd suggest waiting until those are in, or possibly
> attempting to generalize on top of his changes.
>
> Yea, but i think, i just need to perform inflight migration only when some non-local migration flag is set.
>
> I'm not sure about that. While it may be completely safe in your case
> to avoid GET_VRING_BASE/quiescing the backend, it could introduce
> corruption cases on other vhost-user-blk storage backends which work
> differently. I think this would require an "opt-in" migration
> parameter over and above checking that it's not a local migration.
>
> In patch 2 I introduce migration capability for that. Or you think that we need something like per-device parameter?
>

My point here was that you will definitely need a parameter you have.
I'll leave it to others to say if we actually need a per-device
parameter.

> Ideally there would be some way to have the vhost-user backends expose
> support per device with a protocol feature and, if the source and
> destination both advertise support, we perform the inflight migration
> skipping GET_VRING_BASE. I don't know enough about the migration code
> to know how that would be done or if it is practical.
>
> Also (and I think this also applies to Vladimir's series), it should
> be possible to add multiple vhost-user-blk devices to the same Qemu
> instance but have the devices backed by very different backend
> implementations. What happens if some of those backend implementations
> support things like lightweight local migration and/or inflight
> migration but others don't? Ideally you would want the API to be able
> to select target vhost-user-blk devices for accelerated migration
> rather than assuming they will all want to use the same features. I
> don't know if anyone is doing that or if we care about such cases but
> I just want to throw the concern out there.
>
> Hmm, that looks like we need some per-device parameter in order to set this feature on for them.
>
> I guess we need a opinion from migration maintainer about such cases
Re: [PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Markus Armbruster 3 weeks, 4 days ago
Alexandr Moshkov <dtalexundeer@yandex-team.ru> writes:

> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in vhost-user-blk. 
> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
> then, after migration, they will be executed on another host. 
>
> At first, I tried to implement migration for all vhost-user devices that support inflight at once, 
> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and 
> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>
> Therefore, for now I decided to leave this idea for later and 
> implement the migration of the inflight region first for vhost-user-blk.

How is this work related to Vladimir's "vhost-user-blk: live-backend
local migration"?
Re: [PATCH 0/2] vhost-user-blk: support inflight migration
Posted by Alexandr Moshkov 3 weeks, 4 days ago
On 10/20/25 14:55, Markus Armbruster wrote:
> Alexandr Moshkov<dtalexundeer@yandex-team.ru> writes:
>
>> Hi!
>>
>> During inter-host migration, waiting for disk requests to be drained
>> in the vhost-user backend can incur significant downtime.
>>
>> This can be avoided if QEMU migrates the inflight region in vhost-user-blk.
>> Thus, during the qemu migration, the vhost-user backend can cancel all inflight requests and
>> then, after migration, they will be executed on another host.
>>
>> At first, I tried to implement migration for all vhost-user devices that support inflight at once,
>> but this would require a lot of changes both in vhost-user-blk (to transfer it to the base class) and
>> in the vhost-user-base base class (inflight implementation and remodeling + a large refactor).
>>
>> Therefore, for now I decided to leave this idea for later and
>> implement the migration of the inflight region first for vhost-user-blk.
> How is this work related to Vladimir's "vhost-user-blk: live-backend
> local migration"?

Hi!

Vladimir's work is only related to local migration (including inflight 
region fd), when my patch allows to migrate inflight requests to another 
host to reduce downtime without waiting for disk requests to be drained.