Hi!

During inter-host migration, waiting for disk requests to be drained
in the vhost-user backend can incur significant downtime.

This can be avoided if QEMU migrates the inflight region in
vhost-user-blk. Thus, during QEMU migration, the vhost-user backend can
cancel all inflight requests and, after migration, they will be executed
on the other host.

At first, I tried to implement migration for all vhost-user devices that
support inflight at once, but this would require a lot of changes both
in vhost-user-blk (to move its code into the base class) and in the
vhost-user-base base class (inflight implementation and remodeling plus
a large refactor).

Therefore, for now I decided to leave this idea for later and
implement migration of the inflight region for vhost-user-blk first.

Alexandr Moshkov (2):
  vhost: support inflight save/load
  vhost-user-blk: support inflight migration

 hw/block/vhost-user-blk.c | 52 ++++++++++++++++++++++++++++++++++++
 hw/virtio/vhost.c         | 56 +++++++++++++++++++++++++++++++++++++++
 include/hw/virtio/vhost.h |  2 ++
 migration/options.c       |  7 +++++
 migration/options.h       |  1 +
 qapi/migration.json       |  9 +++++--
 6 files changed, 125 insertions(+), 2 deletions(-)

--
2.34.1
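For context, here is a minimal sketch of what patch 1's save/load helpers
could look like. It is an assumption based on the patch title, the struct
vhost_inflight fields in include/hw/virtio/vhost.h (addr, size, offset,
queue_size), and the standard QEMUFile accessors -- not the actual patch
contents:

    /* Sketch only, not the code under review: serialize the inflight
     * region into the migration stream and restore it on the
     * destination. */
    #include "qemu/osdep.h"
    #include "hw/virtio/vhost.h"
    #include "migration/qemu-file.h"

    void vhost_dev_save_inflight(struct vhost_inflight *inflight, QEMUFile *f)
    {
        /* Presence marker: lets the destination know whether a region
         * follows in the stream. */
        if (!inflight || !inflight->addr) {
            qemu_put_sbyte(f, 0);
            return;
        }
        qemu_put_sbyte(f, 1);
        qemu_put_be16(f, inflight->queue_size);
        qemu_put_be64(f, inflight->size);
        qemu_put_be64(f, inflight->offset);
        qemu_put_buffer(f, inflight->addr, inflight->size);
    }

    int vhost_dev_load_inflight(struct vhost_inflight *inflight, QEMUFile *f)
    {
        uint64_t size, offset;

        if (!qemu_get_sbyte(f)) {
            return 0;   /* source had no inflight region */
        }
        inflight->queue_size = qemu_get_be16(f);
        size = qemu_get_be64(f);
        offset = qemu_get_be64(f);
        /* The destination must have a region of at least this size,
         * e.g. set up via vhost_dev_get_inflight(), before the copy. */
        if (!inflight->addr || inflight->size < size) {
            return -EINVAL;
        }
        inflight->offset = offset;
        qemu_get_buffer(f, inflight->addr, size);
        return 0;
    }

With something like this in place, the backend can cancel its inflight
requests at migration time and the destination backend resubmits them from
the restored region, instead of QEMU waiting for the requests to drain.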
The logic looks ok from the vhost-user-blk side but some comments inline.

On Mon, Oct 20, 2025 at 1:47 AM Alexandr Moshkov
<dtalexundeer@yandex-team.ru> wrote:
>
> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in
> vhost-user-blk. Thus, during QEMU migration, the vhost-user backend can
> cancel all inflight requests and, after migration, they will be executed
> on the other host.
>
> At first, I tried to implement migration for all vhost-user devices that
> support inflight at once, but this would require a lot of changes both
> in vhost-user-blk (to move its code into the base class) and in the
> vhost-user-base base class (inflight implementation and remodeling plus
> a large refactor).
>

Even if it's a more significant change I'd rather generalize as much
logic as possible and expose it as a vhost-user protocol feature. IMO
too much vhost-user device-agnostic code is being pushed into
vhost-user-blk.

As Markus noted, this also conflicts significantly with Vladimir's
series, so I'd suggest waiting until those are in, or possibly
attempting to generalize on top of his changes.

> Therefore, for now I decided to leave this idea for later and
> implement migration of the inflight region for vhost-user-blk first.
>
[...]
Hi!

On 10/21/25 23:54, Raphael Norwitz wrote:
> The logic looks ok from the vhost-user-blk side but some comments inline.
>
[...]
>
> Even if it's a more significant change I'd rather generalize as much
> logic as possible and expose it as a vhost-user protocol feature. IMO
> too much vhost-user device-agnostic code is being pushed into
> vhost-user-blk.

As far as I understand (correct me if I'm wrong), this feature must be
implemented in the device itself (along with the inflight field
location) or in some base class for vhost-user devices, and
vhost-user-blk doesn't have one yet. The closest base class I could find
is vhost-user-base, but to use it, it would have to take over all the
device-agnostic code from vhost-user-blk without breaking the existing
vhost-user-base derived devices. For example, to support inflight
migration in the base class, one would first need to implement the
inflight field there, along with the reconnect feature. Then, with a big
refactor, vhost-user-blk would somehow have to inherit from it, along
with other devices such as vhost-user-fs.

So, IMO, it looks like a whole separate track that will need to be
tackled.

> As Markus noted, this also conflicts significantly with Vladimir's
> series, so I'd suggest waiting until those are in, or possibly
> attempting to generalize on top of his changes.

Yes, but I think I just need to perform inflight migration only when
some non-local migration flag is set. And I think Vladimir's series
already has that flag.

Thanks for the comments!
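For readers following the question of where inflight should live: below is
a condensed paraphrase of how hw/block/vhost-user-blk.c manages the region
today (the wrapper function name is invented; the calls are the existing
vhost_dev_get_inflight()/vhost_dev_set_inflight() from hw/virtio/vhost.c).
The key point is that the region is owned by the device state, which is
why it survives backend disconnects while struct vhost_user does not:

    /* Condensed paraphrase of existing vhost-user-blk behaviour, not new
     * code. s->inflight is allocated once at realize time and freed only
     * at unrealize, so it outlives connect/disconnect cycles. */
    static int vhost_user_blk_setup_inflight(VHostUserBlk *s, Error **errp)
    {
        int ret;

        if (s->inflight->addr == NULL) {
            /* First start: have the backend allocate the shared
             * inflight region. */
            ret = vhost_dev_get_inflight(&s->dev, s->queue_size, s->inflight);
            if (ret < 0) {
                error_setg_errno(errp, -ret, "Error getting inflight");
                return ret;
            }
        }

        /* On every (re)start, including after a reconnect, hand the
         * surviving region back so the backend can resubmit inflight
         * requests. */
        ret = vhost_dev_set_inflight(&s->dev, s->inflight);
        if (ret < 0) {
            error_setg_errno(errp, -ret, "Error setting inflight");
        }
        return ret;
    }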
On Wed, Oct 22, 2025 at 3:59 AM Alexandr Moshkov
<dtalexundeer@yandex-team.ru> wrote:
>
> Hi!
>
> On 10/21/25 23:54, Raphael Norwitz wrote:
>
[...]
>
> As far as I understand (correct me if I'm wrong), this feature must be
> implemented in the device itself (along with the inflight field
> location) or in some base class for vhost-user devices, and
> vhost-user-blk doesn't have one yet. The closest base class I could find
> is vhost-user-base, but to use it, it would have to take over all the
> device-agnostic code from vhost-user-blk without breaking the existing
> vhost-user-base derived devices. For example, to support inflight
> migration in the base class, one would first need to implement the
> inflight field there, along with the reconnect feature. Then, with a big
> refactor, vhost-user-blk would somehow have to inherit from it, along
> with other devices such as vhost-user-fs.
>
> So, IMO, it looks like a whole separate track that will need to be
> tackled.
>

I get that inflight is inside of struct VHostUserBlk, not in struct
vhost_user. Even if not all device types support inflight I think that
could probably be changed? If we do keep things as you have it here,
patch 1 should probably put vhost_dev_load_inflight() and
vhost_dev_save_inflight() into vhost-user.c rather than vhost.c, since
only vhost-user devices seem to use inflight.

> Yes, but I think I just need to perform inflight migration only when
> some non-local migration flag is set.
>

I'm not sure about that. While it may be completely safe in your case
to avoid GET_VRING_BASE/quiescing the backend, it could introduce
corruption cases on other vhost-user-blk storage backends which work
differently. I think this would require an "opt-in" migration
parameter over and above checking that it's not a local migration.

Ideally there would be some way to have the vhost-user backends expose
support per device with a protocol feature and, if the source and
destination both advertise support, perform the inflight migration
skipping GET_VRING_BASE. I don't know enough about the migration code
to know how that would be done or if it is practical.

Also (and I think this also applies to Vladimir's series), it should
be possible to add multiple vhost-user-blk devices to the same QEMU
instance but have the devices backed by very different backend
implementations. What happens if some of those backend implementations
support things like lightweight local migration and/or inflight
migration but others don't? Ideally you would want the API to be able
to select target vhost-user-blk devices for accelerated migration
rather than assuming they will all want to use the same features. I
don't know if anyone is doing that or if we care about such cases but
I just want to throw the concern out there.

> And I think Vladimir's series already has that flag.
>
> Thanks for the comments!
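A sketch of the protocol-feature gate suggested above. The feature bit is
hypothetical -- it is not in the vhost-user specification -- but
virtio_has_feature() and the protocol_features field of struct vhost_dev
are the existing negotiation mechanisms in QEMU:

    /* Hypothetical feature bit, invented for illustration; a real value
     * would have to be allocated in the vhost-user spec. */
    #define VHOST_USER_PROTOCOL_F_INFLIGHT_MIGRATION 20 /* placeholder */

    static bool vhost_user_supports_inflight_migration(struct vhost_dev *dev)
    {
        /* Protocol features are negotiated per device at connect time,
         * so this naturally allows mixed backends: only devices whose
         * backend advertises the bit would skip GET_VRING_BASE and
         * migrate the inflight region instead. */
        return virtio_has_feature(dev->protocol_features,
                                  VHOST_USER_PROTOCOL_F_INFLIGHT_MIGRATION);
    }

The open problem noted above remains: the source cannot see what the
destination's backend advertises, so some migration-level handshake or an
operator-set parameter would still be needed.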
On 10/23/25 02:33, Raphael Norwitz wrote:
> On Wed, Oct 22, 2025 at 3:59 AM Alexandr Moshkov
> <dtalexundeer@yandex-team.ru> wrote:
>
[...]
>
> I get that inflight is inside of struct VHostUserBlk, not in struct
> vhost_user. Even if not all device types support inflight I think that
> could probably be changed? If we do keep things as you have it here,
> patch 1 should probably put vhost_dev_load_inflight() and
> vhost_dev_save_inflight() into vhost-user.c rather than vhost.c, since
> only vhost-user devices seem to use inflight.

Firstly, I tried to implement inflight in the vhost_user structure, but
that structure is initialized on every vhost_user_blk_connect() and
destroyed on every vhost_user_blk_disconnect(), whereas the inflight
region must survive a device disconnect. During an active migration, the
vhost_user struct would be destroyed before the inflight is saved.

And you are right about the location of the functions in patch 1. I will
move them to vhost-user.c, which will be more appropriate. Thanks!

> I'm not sure about that. While it may be completely safe in your case
> to avoid GET_VRING_BASE/quiescing the backend, it could introduce
> corruption cases on other vhost-user-blk storage backends which work
> differently. I think this would require an "opt-in" migration
> parameter over and above checking that it's not a local migration.

In patch 2 I introduce a migration capability for that. Or do you think
that we need something like a per-device parameter?

> Ideally there would be some way to have the vhost-user backends expose
> support per device with a protocol feature and, if the source and
> destination both advertise support, perform the inflight migration
> skipping GET_VRING_BASE. I don't know enough about the migration code
> to know how that would be done or if it is practical.
>
> Also (and I think this also applies to Vladimir's series), it should
> be possible to add multiple vhost-user-blk devices to the same QEMU
> instance but have the devices backed by very different backend
> implementations. What happens if some of those backend implementations
> support things like lightweight local migration and/or inflight
> migration but others don't? Ideally you would want the API to be able
> to select target vhost-user-blk devices for accelerated migration
> rather than assuming they will all want to use the same features. I
> don't know if anyone is doing that or if we care about such cases but
> I just want to throw the concern out there.

Hmm, it looks like we need some per-device parameter in order to turn
this feature on for them.

I guess we need an opinion from a migration maintainer about such cases.
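Per the diffstat, patch 2's capability touches qapi/migration.json and
migration/options.c; an accessor in the style of the existing helpers
there (compare migrate_postcopy_ram()) might look like the following. The
capability name is a guess from the patch title, not the actual code:

    /* Sketch of a migration-capability accessor following the
     * migration/options.c pattern; the capability name is assumed. */
    bool migrate_vhost_user_blk_inflight(void)
    {
        MigrationState *s = migrate_get_current();

        return s->capabilities[MIGRATION_CAPABILITY_VHOST_USER_BLK_INFLIGHT];
    }

vhost-user-blk would then save the inflight region only when this returns
true, which keeps the behaviour opt-in as requested above.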
On Thu, Oct 23, 2025 at 2:44 AM Alexandr Moshkov
<dtalexundeer@yandex-team.ru> wrote:
>
[...]
>
> Firstly, I tried to implement inflight in the vhost_user structure, but
> that structure is initialized on every vhost_user_blk_connect() and
> destroyed on every vhost_user_blk_disconnect(), whereas the inflight
> region must survive a device disconnect. During an active migration,
> the vhost_user struct would be destroyed before the inflight is saved.
>

In struct vhost we should probably add some data which persists between
reconnects. I take your point that we cannot generalize inflight
migration without doing that first.

> In patch 2 I introduce a migration capability for that. Or do you think
> that we need something like a per-device parameter?
>

My point here was that you will definitely need the parameter you have.
I'll leave it to others to say whether we actually need a per-device
parameter.

> Hmm, it looks like we need some per-device parameter in order to turn
> this feature on for them.
>
> I guess we need an opinion from a migration maintainer about such cases.
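If a per-device opt-in turns out to be wanted, the conventional mechanism
is a qdev property on the device. A sketch, with the property name and
field invented for illustration (the surrounding idiom is the one
vhost-user-blk already uses for its "chardev" and "num-queues"
properties):

    /* Hypothetical per-device knob; property name and field are
     * illustrative only. */
    static const Property vhost_user_blk_properties[] = {
        /* ... existing properties ("chardev", "num-queues", ...) ... */
        DEFINE_PROP_BOOL("inflight-migration", VHostUserBlk,
                         inflight_migration, false),
    };

    /* The save path could then combine the global capability with the
     * per-device knob: */
    if (s->inflight_migration && migrate_vhost_user_blk_inflight()) {
        vhost_dev_save_inflight(s->inflight, f);
    }

That would let management software enable accelerated migration for some
vhost-user-blk devices while leaving others on the drain-and-migrate path.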
Alexandr Moshkov <dtalexundeer@yandex-team.ru> writes:

> Hi!
>
> During inter-host migration, waiting for disk requests to be drained
> in the vhost-user backend can incur significant downtime.
>
> This can be avoided if QEMU migrates the inflight region in
> vhost-user-blk. Thus, during QEMU migration, the vhost-user backend can
> cancel all inflight requests and, after migration, they will be executed
> on the other host.
>
[...]
>
> Therefore, for now I decided to leave this idea for later and
> implement migration of the inflight region for vhost-user-blk first.

How is this work related to Vladimir's "vhost-user-blk: live-backend
local migration"?
On 10/20/25 14:55, Markus Armbruster wrote:
> Alexandr Moshkov <dtalexundeer@yandex-team.ru> writes:
>
[...]
>
> How is this work related to Vladimir's "vhost-user-blk: live-backend
> local migration"?

Hi!

Vladimir's work covers only local migration (including passing the
inflight region fd), while my patch allows migrating inflight requests
to another host, reducing downtime by not waiting for disk requests to
be drained.