Live update is a technique wherein an application saves its state, exec's
to an updated version of itself, and restores its state. Clients of the
application experience a brief suspension of service, on the order of
100's of milliseconds, but are otherwise unaffected.

Define and implement interfaces that allow vdpa devices to be preserved
across fork or exec, to support live update for applications such as qemu.
The device must be suspended during the update, but its dma mappings are
preserved, so the suspension is brief.

The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
accounting from one process to another.

The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
VHOST_NEW_OWNER is supported.

The VHOST_IOTLB_REMAP message type updates a dma mapping with its userland
address in the new process.

The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
VHOST_IOTLB_REMAP is supported and required. Some devices do not
require it, because the userland address of each dma mapping is discarded
after being translated to a physical address.

Here is a pseudo-code sequence for performing live update, based on
suspend + reset because resume is not yet available. The vdpa device
descriptor, fd, remains open across the exec.

  ioctl(fd, VHOST_VDPA_SUSPEND)
  ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
  exec

  ioctl(fd, VHOST_NEW_OWNER)

  issue ioctls to re-create vrings

  if VHOST_BACKEND_F_IOTLB_REMAP
      foreach dma mapping
          write(fd, {VHOST_IOTLB_REMAP, new_addr})

  ioctl(fd, VHOST_VDPA_SET_STATUS,
        ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK)
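
As a rough illustration only (not code from this series), the post-exec half
of the sequence above might look like the following in C. Several things here
are assumptions: the numeric values given to VHOST_NEW_OWNER,
VHOST_IOTLB_REMAP and VHOST_BACKEND_F_IOTLB_REMAP are placeholders so the
sketch compiles without the patched uapi headers, struct dma_map and
restore_after_exec() are hypothetical names, and the remap message is assumed
to reuse the iova/size/uaddr layout of VHOST_IOTLB_UPDATE. Vring re-creation
is elided.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vhost.h>
#include <linux/virtio_config.h>

/* Only needed when building against uapi headers that predate vhost-vdpa. */
#ifndef VHOST_VDPA_SET_STATUS
#define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
#endif

/* PLACEHOLDER values: the real ones come from the headers of this series.
 * Do not issue these against a real device without the patched headers. */
#ifndef VHOST_NEW_OWNER
#define VHOST_NEW_OWNER _IO(VHOST_VIRTIO, 0x05)
#endif
#ifndef VHOST_IOTLB_REMAP
#define VHOST_IOTLB_REMAP 7
#endif
#ifndef VHOST_BACKEND_F_IOTLB_REMAP
#define VHOST_BACKEND_F_IOTLB_REMAP 0xa
#endif

/* Hypothetical bookkeeping the application keeps for each dma mapping. */
struct dma_map {
    uint64_t iova;
    uint64_t size;
    void *new_addr;   /* where the same memory is mapped after exec */
};

/* Returns 0 on success, -1 as soon as any step fails. */
int restore_after_exec(int fd, uint64_t backend_features,
                       const struct dma_map *maps, size_t n)
{
    /* Take over ownership and pinned memory accounting. */
    if (ioctl(fd, VHOST_NEW_OWNER) < 0)
        return -1;

    /* ... issue ioctls to re-create vrings here ... */

    /* Tell the device the new userland address of each mapping, if needed. */
    if (backend_features & (1ULL << VHOST_BACKEND_F_IOTLB_REMAP)) {
        for (size_t i = 0; i < n; i++) {
            struct vhost_msg_v2 msg;

            memset(&msg, 0, sizeof(msg));
            msg.type = VHOST_IOTLB_MSG_V2;
            msg.iotlb.type = VHOST_IOTLB_REMAP;
            msg.iotlb.iova = maps[i].iova;    /* assumed: same fields as UPDATE */
            msg.iotlb.size = maps[i].size;
            msg.iotlb.uaddr = (uint64_t)(uintptr_t)maps[i].new_addr;
            if (write(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
                return -1;
        }
    }

    /* Bring the device back up. */
    uint8_t status = VIRTIO_CONFIG_S_ACKNOWLEDGE | VIRTIO_CONFIG_S_DRIVER |
                     VIRTIO_CONFIG_S_FEATURES_OK | VIRTIO_CONFIG_S_DRIVER_OK;
    return ioctl(fd, VHOST_VDPA_SET_STATUS, &status) < 0 ? -1 : 0;
}
```

The backend_features argument is expected to be the value read earlier with
VHOST_GET_BACKEND_FEATURES. The pre-exec half (suspend, then set status to 0)
is two plain ioctl calls and is omitted.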

Steve Sistare (13):
  vhost-vdpa: count pinned memory
  vhost-vdpa: pass mm to bind
  vhost-vdpa: VHOST_NEW_OWNER
  vhost-vdpa: VHOST_BACKEND_F_NEW_OWNER
  vhost-vdpa: VHOST_IOTLB_REMAP
  vhost-vdpa: VHOST_BACKEND_F_IOTLB_REMAP
  vhost-vdpa: flush workers on suspend
  vduse: flush workers on suspend
  vdpa_sim: reset must not run
  vdpa_sim: flush workers on suspend
  vdpa/mlx5: new owner capability
  vdpa_sim: new owner capability
  vduse: new owner capability

drivers/vdpa/mlx5/net/mlx5_vnet.c | 3 +-
drivers/vdpa/vdpa_sim/vdpa_sim.c | 24 ++++++-
drivers/vdpa/vdpa_user/vduse_dev.c | 32 +++++++++
drivers/vhost/vdpa.c | 101 +++++++++++++++++++++++++++--
drivers/vhost/vhost.c | 15 +++++
drivers/vhost/vhost.h | 1 +
include/uapi/linux/vhost.h | 10 +++
include/uapi/linux/vhost_types.h | 15 ++++-
8 files changed, 191 insertions(+), 10 deletions(-)
--
2.39.3
On Thu, Jan 11, 2024 at 4:40 AM Steve Sistare <steven.sistare@oracle.com> wrote:
>
> Live update is a technique wherein an application saves its state, exec's
> to an updated version of itself, and restores its state. Clients of the
> application experience a brief suspension of service, on the order of
> 100's of milliseconds, but are otherwise unaffected.
>
> Define and implement interfaces that allow vdpa devices to be preserved
> across fork or exec, to support live update for applications such as qemu.
> The device must be suspended during the update, but its dma mappings are
> preserved, so the suspension is brief.
>
> The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
> accounting from one process to another.
>
> The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
> VHOST_NEW_OWNER is supported.
>
> The VHOST_IOTLB_REMAP message type updates a dma mapping with its userland
> address in the new process.
>
> The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
> VHOST_IOTLB_REMAP is supported and required. Some devices do not
> require it, because the userland address of each dma mapping is discarded
> after being translated to a physical address.
>
> Here is a pseudo-code sequence for performing live update, based on
> suspend + reset because resume is not yet available. The vdpa device
> descriptor, fd, remains open across the exec.
>
> ioctl(fd, VHOST_VDPA_SUSPEND)
> ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
> exec
Is there a userspace implementation as a reference?
>
> ioctl(fd, VHOST_NEW_OWNER)
>
> issue ioctls to re-create vrings
>
> if VHOST_BACKEND_F_IOTLB_REMAP
>     foreach dma mapping
>         write(fd, {VHOST_IOTLB_REMAP, new_addr})

I think I need to understand the advantages of this approach. For
example, why it is better than

    ioctl(VHOST_RESET_OWNER)
    exec

    ioctl(VHOST_SET_OWNER)

    for each dma mapping
        ioctl(VHOST_IOTLB_UPDATE)

Thanks

>
> ioctl(fd, VHOST_VDPA_SET_STATUS,
>       ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK)
>
>
> Steve Sistare (13):
> vhost-vdpa: count pinned memory
> vhost-vdpa: pass mm to bind
> vhost-vdpa: VHOST_NEW_OWNER
> vhost-vdpa: VHOST_BACKEND_F_NEW_OWNER
> vhost-vdpa: VHOST_IOTLB_REMAP
> vhost-vdpa: VHOST_BACKEND_F_IOTLB_REMAP
> vhost-vdpa: flush workers on suspend
> vduse: flush workers on suspend
> vdpa_sim: reset must not run
> vdpa_sim: flush workers on suspend
> vdpa/mlx5: new owner capability
> vdpa_sim: new owner capability
> vduse: new owner capability
>
> drivers/vdpa/mlx5/net/mlx5_vnet.c | 3 +-
> drivers/vdpa/vdpa_sim/vdpa_sim.c | 24 ++++++-
> drivers/vdpa/vdpa_user/vduse_dev.c | 32 +++++++++
> drivers/vhost/vdpa.c | 101 +++++++++++++++++++++++++++--
> drivers/vhost/vhost.c | 15 +++++
> drivers/vhost/vhost.h | 1 +
> include/uapi/linux/vhost.h | 10 +++
> include/uapi/linux/vhost_types.h | 15 ++++-
> 8 files changed, 191 insertions(+), 10 deletions(-)
>
> --
> 2.39.3
>
On 1/10/2024 9:55 PM, Jason Wang wrote:
> On Thu, Jan 11, 2024 at 4:40 AM Steve Sistare <steven.sistare@oracle.com> wrote:
>>
>> Live update is a technique wherein an application saves its state, exec's
>> to an updated version of itself, and restores its state. Clients of the
>> application experience a brief suspension of service, on the order of
>> 100's of milliseconds, but are otherwise unaffected.
>>
>> Define and implement interfaces that allow vdpa devices to be preserved
>> across fork or exec, to support live update for applications such as qemu.
>> The device must be suspended during the update, but its dma mappings are
>> preserved, so the suspension is brief.
>>
>> The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
>> accounting from one process to another.
>>
>> The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
>> VHOST_NEW_OWNER is supported.
>>
>> The VHOST_IOTLB_REMAP message type updates a dma mapping with its userland
>> address in the new process.
>>
>> The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
>> VHOST_IOTLB_REMAP is supported and required. Some devices do not
>> require it, because the userland address of each dma mapping is discarded
>> after being translated to a physical address.
>>
>> Here is a pseudo-code sequence for performing live update, based on
>> suspend + reset because resume is not yet available. The vdpa device
>> descriptor, fd, remains open across the exec.
>>
>> ioctl(fd, VHOST_VDPA_SUSPEND)
>> ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
>> exec
>
> Is there a userspace implementation as a reference?
I have working patches for qemu that use these ioctl's, but they depend on other
qemu cpr patches that are a work in progress, and not posted yet. I'm working on
that.
>> ioctl(fd, VHOST_NEW_OWNER)
>>
>> issue ioctls to re-create vrings
>>
>> if VHOST_BACKEND_F_IOTLB_REMAP
>>     foreach dma mapping
>>         write(fd, {VHOST_IOTLB_REMAP, new_addr})
>
> I think I need to understand the advantages of this approach. For
> example, why it is better than
>
> ioctl(VHOST_RESET_OWNER)
> exec
>
> ioctl(VHOST_SET_OWNER)
>
> for each dma mapping
>     ioctl(VHOST_IOTLB_UPDATE)

That is slower. VHOST_RESET_OWNER unbinds physical pages, and VHOST_IOTLB_UPDATE
rebinds them. It costs multiple seconds for large memories, and is incurred during the
virtual machine's pause time during live update. For comparison, the total pause time
for live update with vfio interfaces is ~100 millis.

However, the interaction with userland is so similar that the same code paths can be used.
In my qemu prototype, after cpr exec's new qemu:
  - vhost_vdpa_set_owner() calls VHOST_NEW_OWNER instead of VHOST_SET_OWNER
  - vhost_vdpa_dma_map() sets type VHOST_IOTLB_REMAP instead of VHOST_IOTLB_UPDATE
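
To make the two substitutions above concrete, here is a minimal C sketch of
how a shared code path could select the request and message type. It is not
the qemu prototype: owner_request() and dma_map_msg_type() are hypothetical
helpers, and the values given to VHOST_NEW_OWNER and VHOST_IOTLB_REMAP are
placeholders so that the sketch builds without the patched uapi headers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* PLACEHOLDER values: the real ones come from the headers of this series. */
#ifndef VHOST_NEW_OWNER
#define VHOST_NEW_OWNER _IO(VHOST_VIRTIO, 0x05)
#endif
#ifndef VHOST_IOTLB_REMAP
#define VHOST_IOTLB_REMAP 7
#endif

/* ioctl request used to claim the device: after a cpr exec the existing
 * ownership is transferred instead of being set up from scratch. */
unsigned long owner_request(bool after_cpr_exec)
{
    return after_cpr_exec ? VHOST_NEW_OWNER : VHOST_SET_OWNER;
}

/* iotlb message type used when (re)establishing a dma mapping: after a cpr
 * exec the pages are still bound, so only the userland address is updated. */
uint8_t dma_map_msg_type(bool after_cpr_exec)
{
    return after_cpr_exec ? VHOST_IOTLB_REMAP : VHOST_IOTLB_UPDATE;
}
```

The rest of the caller (building the message, issuing the ioctl or write)
stays the same in both cases, which is the point of reusing the path.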
- Steve
On Thu, Jan 18, 2024 at 4:32 AM Steven Sistare
<steven.sistare@oracle.com> wrote:
>
> On 1/10/2024 9:55 PM, Jason Wang wrote:
> > On Thu, Jan 11, 2024 at 4:40 AM Steve Sistare <steven.sistare@oracle.com> wrote:
> >>
> >> Live update is a technique wherein an application saves its state, exec's
> >> to an updated version of itself, and restores its state. Clients of the
> >> application experience a brief suspension of service, on the order of
> >> 100's of milliseconds, but are otherwise unaffected.
> >>
> >> Define and implement interfaces that allow vdpa devices to be preserved
> >> across fork or exec, to support live update for applications such as qemu.
> >> The device must be suspended during the update, but its dma mappings are
> >> preserved, so the suspension is brief.
> >>
> >> The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
> >> accounting from one process to another.
> >>
> >> The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
> >> VHOST_NEW_OWNER is supported.
> >>
> >> The VHOST_IOTLB_REMAP message type updates a dma mapping with its userland
> >> address in the new process.
> >>
> >> The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
> >> VHOST_IOTLB_REMAP is supported and required. Some devices do not
> >> require it, because the userland address of each dma mapping is discarded
> >> after being translated to a physical address.
> >>
> >> Here is a pseudo-code sequence for performing live update, based on
> >> suspend + reset because resume is not yet available. The vdpa device
> >> descriptor, fd, remains open across the exec.
> >>
> >> ioctl(fd, VHOST_VDPA_SUSPEND)
> >> ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
> >> exec
> >
> > Is there a userspace implementation as a reference?
>
> I have working patches for qemu that use these ioctl's, but they depend on other
> qemu cpr patches that are a work in progress, and not posted yet. I'm working on
> that.
Ok.
>
> >> ioctl(fd, VHOST_NEW_OWNER)
> >>
> >> issue ioctls to re-create vrings
> >>
> >> if VHOST_BACKEND_F_IOTLB_REMAP
> >>     foreach dma mapping
> >>         write(fd, {VHOST_IOTLB_REMAP, new_addr})
> >
> > I think I need to understand the advantages of this approach. For
> > example, why it is better than
> >
> > ioctl(VHOST_RESET_OWNER)
> > exec
> >
> > ioctl(VHOST_SET_OWNER)
> >
> > for each dma mapping
> >     ioctl(VHOST_IOTLB_UPDATE)
>
> That is slower. VHOST_RESET_OWNER unbinds physical pages, and VHOST_IOTLB_UPDATE
> rebinds them. It costs multiple seconds for large memories, and is incurred during the
> virtual machine's pause time during live update. For comparison, the total pause time
> for live update with vfio interfaces is ~100 millis.
>
> However, the interaction with userland is so similar that the same code paths can be used.
> In my qemu prototype, after cpr exec's new qemu:
> - vhost_vdpa_set_owner() calls VHOST_NEW_OWNER instead of VHOST_SET_OWNER
> - vhost_vdpa_dma_map() sets type VHOST_IOTLB_REMAP instead of VHOST_IOTLB_UPDATE
>
> - Steve
>
Ok, let's document this in the changelog.

Thanks