[v1] Extend vhost-user to support VFIO based accelerators

[Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based accelerators

Posted by Tiwei Bie 8 years, 1 month ago

This RFC patch set makes some small extensions to the vhost-user
protocol to support VFIO based accelerators, making it possible to
achieve performance similar to VFIO passthru while keeping the virtio
device emulation in QEMU.

When we have virtio ring compatible devices, it's possible to set up
the device (DMA mapping, PCI config, etc.) based on the existing info
(memory-table, features, vring info, etc.) which is available on the
vhost-backend (e.g. DPDK vhost library). Then, we will be able to
use such devices to accelerate the emulated device for the VM. We
call this vDPA: vhost DataPath Acceleration. The key difference
between VFIO passthru and vDPA is that in vDPA only the data path
(e.g. ring, notify and queue interrupt) is passed through, while the
device control path (e.g. PCI configuration space and MMIO regions)
is still defined and emulated by QEMU.

The benefits of keeping virtio device emulation in QEMU, compared
with virtio device VFIO passthru, include (but are not limited to):

- consistent device interface from guest OS;
- max flexibility on control path and hardware design;
- leveraging the existing virtio live-migration framework;

But the critical issue in vDPA is that the data path performance is
relatively low and some host threads are needed for the data path,
because the mechanisms needed to support the following are missing:

1) guest driver notifies the device directly;
2) device interrupts the guest directly;

So this patch set makes some small extensions to the vhost-user
protocol to make both of them possible. It leverages the same
mechanisms (e.g. EPT and Posted-Interrupt on Intel platforms) as VFIO
passthru to achieve the data path passthrough.

A new protocol feature bit is added to negotiate the accelerator feature
support. Two new slave message types are added to enable the notify and
interrupt passthru for each queue. From the point of view of vhost-user
protocol design, this is very flexible: the passthru can be enabled or
disabled for each queue individually, and each queue can be accelerated
by a different device. More design and implementation details can be
found in the last patch.
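
As a rough illustration of the flow described above (this is not code
from the patch set): the sketch below models a protocol feature bit that
must be offered by both sides, and two slave request types that switch
notify and interrupt passthru on or off for a single queue. Every name
and number in it (EXAMPLE_PROTOCOL_F_ACCEL, the EXAMPLE_SLAVE_* request
IDs, EXAMPLE_MAX_QUEUES and the structs) is hypothetical and chosen only
for the example; the real definitions are in the last patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the numbers used by the patch. */
#define EXAMPLE_PROTOCOL_F_ACCEL 15
#define EXAMPLE_MAX_QUEUES       8

/* Hypothetical slave-to-master request types, one per passthru kind. */
enum example_slave_request {
    EXAMPLE_SLAVE_SET_NOTIFY_PASSTHRU = 100,
    EXAMPLE_SLAVE_SET_IRQ_PASSTHRU    = 101,
};

/* Per-queue state: each queue is switched independently of the others. */
struct example_queue_accel {
    bool notify_passthru; /* guest driver notifies the device directly */
    bool irq_passthru;    /* device interrupts the guest directly */
};

struct example_dev {
    uint64_t protocol_features; /* bits accepted by both sides */
    struct example_queue_accel queues[EXAMPLE_MAX_QUEUES];
};

/* A feature is usable only if both master and slave offer it. */
static uint64_t example_negotiate(uint64_t master_bits, uint64_t slave_bits)
{
    return master_bits & slave_bits;
}

static bool example_accel_negotiated(const struct example_dev *dev)
{
    assert(dev != NULL);
    return (dev->protocol_features >> EXAMPLE_PROTOCOL_F_ACCEL) & 1;
}

/*
 * Apply one slave request to one queue.
 * Returns 0 on success, -1 if the feature was not negotiated,
 * the queue index is out of range, or the request type is unknown.
 */
static int example_handle_slave_request(struct example_dev *dev,
                                        enum example_slave_request req,
                                        unsigned int queue, bool enable)
{
    if (!example_accel_negotiated(dev) || queue >= EXAMPLE_MAX_QUEUES) {
        return -1;
    }
    switch (req) {
    case EXAMPLE_SLAVE_SET_NOTIFY_PASSTHRU:
        dev->queues[queue].notify_passthru = enable;
        return 0;
    case EXAMPLE_SLAVE_SET_IRQ_PASSTHRU:
        dev->queues[queue].irq_passthru = enable;
        return 0;
    default:
        return -1;
    }
}
```

Keeping the state per queue rather than per device is what allows one
queue to use passthru while another stays on the regular path.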

There are some rough edges in this patch set (so this is an RFC patch
set for now), but it's never too early to hear the thoughts of the
community! Any comments and suggestions would be really appreciated!

Tiwei Bie (3):
  vhost-user: support receiving file descriptors in slave_read
  vhost-user: introduce shared vhost-user state
  vhost-user: add VFIO based accelerators support

 docs/interop/vhost-user.txt     |  57 ++++++
 hw/scsi/vhost-user-scsi.c       |   6 +-
 hw/vfio/common.c                |   2 +-
 hw/virtio/vhost-user.c          | 430 +++++++++++++++++++++++++++++++++++++++-
 hw/virtio/vhost.c               |   3 +-
 hw/virtio/virtio-pci.c          |   8 -
 hw/virtio/virtio-pci.h          |   8 +
 include/hw/vfio/vfio.h          |   2 +
 include/hw/virtio/vhost-user.h  |  43 ++++
 include/hw/virtio/virtio-scsi.h |   6 +-
 net/vhost-user.c                |  30 +--
 11 files changed, 561 insertions(+), 34 deletions(-)
 create mode 100644 include/hw/virtio/vhost-user.h

-- 
2.13.3

Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based accelerators

Posted by Alexey Kardashevskiy 8 years, 1 month ago

On 22/12/17 17:41, Tiwei Bie wrote:
> This RFC patch set does some small extensions to vhost-user protocol
> to support VFIO based accelerators, and makes it possible to get the
> similar performance of VFIO passthru while keeping the virtio device
> emulation in QEMU.
> 
> When we have virtio ring compatible devices, it's possible to setup
> the device (DMA mapping, PCI config, etc) based on the existing info
> (memory-table, features, vring info, etc) which is available on the
> vhost-backend (e.g. DPDK vhost library). Then, we will be able to
> use such devices to accelerate the emulated device for the VM. And
> we call it vDPA: vhost DataPath Acceleration. The key difference
> between VFIO passthru and vDPA is that, in vDPA only the data path
> (e.g. ring, notify and queue interrupt) is pass-throughed, the device
> control path (e.g. PCI configuration space and MMIO regions) is still
> defined and emulated by QEMU.
> 
> The benefits of keeping virtio device emulation in QEMU compared
> with virtio device VFIO passthru include (but not limit to):
> 
> - consistent device interface from guest OS;
> - max flexibility on control path and hardware design;
> - leveraging the existing virtio live-migration framework;
> 
> But the critical issue in vDPA is that the data path performance is
> relatively low and some host threads are needed for the data path,
> because some necessary mechanisms are missing to support:
> 
> 1) guest driver notifies the device directly;
> 2) device interrupts the guest directly;
> 
> So this patch set does some small extensions to vhost-user protocol
> to make both of them possible. It leverages the same mechanisms (e.g.
> EPT and Posted-Interrupt on Intel platform) as the VFIO passthru to
> achieve the data path pass through.
> 
> A new protocol feature bit is added to negotiate the accelerator feature
> support. Two new slave message types are added to enable the notify and
> interrupt passthru for each queue. From the view of vhost-user protocol
> design, it's very flexible. The passthru can be enabled/disabled for
> each queue individually, and it's possible to accelerate each queue by
> different devices. More design and implementation details can be found
> from the last patch.
> 
> There are some rough edges in this patch set (so this is a RFC patch
> set for now), but it's never too early to hear the thoughts from the
> community! So any comments and suggestions would be really appreciated!

I am missing a lot of context here. Out of curiosity - how is this all
supposed to work? A QEMU command line example would be useful. What will
the guest see? A virtio device (i.e. Red Hat vendor ID) or an actual PCI
device (since VFIO is mentioned)? Thanks.



-- 
Alexey

Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based accelerators

Posted by Liang, Cunming 8 years, 1 month ago


> -----Original Message-----
> From: Alexey Kardashevskiy [mailto:aik@ozlabs.ru]
> Sent: Tuesday, January 2, 2018 10:42 AM
> To: Bie, Tiwei <tiwei.bie@intel.com>; virtio-dev@lists.oasis-open.org;
> qemu-devel@nongnu.org; mst@redhat.com; alex.williamson@redhat.com;
> pbonzini@redhat.com; stefanha@redhat.com
> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Liang, Cunming
> <cunming.liang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>; Wang,
> Zhihong <zhihong.wang@intel.com>; Daly, Dan <dan.daly@intel.com>
> Subject: Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based
> accelerators
> 
> On 22/12/17 17:41, Tiwei Bie wrote:
> > This RFC patch set does some small extensions to vhost-user protocol
> > to support VFIO based accelerators, and makes it possible to get the
> > similar performance of VFIO passthru while keeping the virtio device
> > emulation in QEMU.
> >
> > When we have virtio ring compatible devices, it's possible to setup
> > the device (DMA mapping, PCI config, etc) based on the existing info
> > (memory-table, features, vring info, etc) which is available on the
> > vhost-backend (e.g. DPDK vhost library). Then, we will be able to use
> > such devices to accelerate the emulated device for the VM. And we call
> > it vDPA: vhost DataPath Acceleration. The key difference between VFIO
> > passthru and vDPA is that, in vDPA only the data path (e.g. ring,
> > notify and queue interrupt) is pass-throughed, the device control path
> > (e.g. PCI configuration space and MMIO regions) is still defined and
> > emulated by QEMU.
> >
> > The benefits of keeping virtio device emulation in QEMU compared with
> > virtio device VFIO passthru include (but not limit to):
> >
> > - consistent device interface from guest OS;
> > - max flexibility on control path and hardware design;
> > - leveraging the existing virtio live-migration framework;
> >
> > But the critical issue in vDPA is that the data path performance is
> > relatively low and some host threads are needed for the data path,
> > because some necessary mechanisms are missing to support:
> >
> > 1) guest driver notifies the device directly;
> > 2) device interrupts the guest directly;
> >
> > So this patch set does some small extensions to vhost-user protocol to
> > make both of them possible. It leverages the same mechanisms (e.g.
> > EPT and Posted-Interrupt on Intel platform) as the VFIO passthru to
> > achieve the data path pass through.
> >
> > A new protocol feature bit is added to negotiate the accelerator
> > feature support. Two new slave message types are added to enable the
> > notify and interrupt passthru for each queue. From the view of
> > vhost-user protocol design, it's very flexible. The passthru can be
> > enabled/disabled for each queue individually, and it's possible to
> > accelerate each queue by different devices. More design and
> > implementation details can be found from the last patch.
> >
> > There are some rough edges in this patch set (so this is a RFC patch
> > set for now), but it's never too early to hear the thoughts from the
> > community! So any comments and suggestions would be really appreciated!
> 
> I am missing a lot of context here. Out of curiosity - how is this all supposed to
> work? QEMU command line example would be useful, what will the guest see? A
> virtio device (i.e. Redhat vendor ID) or an actual PCI device (since VFIO is
> mentioned)? Thanks.

It's a normal virtio PCIe device in the guest. The extensions on the host are transparent to the guest.

In terms of usage, there's a sample that may help:
http://dpdk.org/ml/archives/dev/2017-December/085044.html
The sample takes a virtio-net device in a VM as the data path accelerator of virtio-net in a nested VM.
When taking a physical device on bare metal, it accelerates virtio-net in the VM equivalently.
No additional QEMU command line parameters are needed for vhost-user.

Some more context, including vDPA enabling in the DPDK vhost-user library:
http://dpdk.org/ml/archives/dev/2017-December/084792.html

> 
> 
> 
> --
> Alexey

Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based accelerators

Posted by Alexey Kardashevskiy 8 years, 1 month ago

On 02/01/18 16:49, Liang, Cunming wrote:
> 
> 
>> -----Original Message-----
>> From: Alexey Kardashevskiy [mailto:aik@ozlabs.ru]
>> Sent: Tuesday, January 2, 2018 10:42 AM
>> To: Bie, Tiwei <tiwei.bie@intel.com>; virtio-dev@lists.oasis-open.org;
>> qemu-devel@nongnu.org; mst@redhat.com; alex.williamson@redhat.com;
>> pbonzini@redhat.com; stefanha@redhat.com
>> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Liang, Cunming
>> <cunming.liang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>; Wang,
>> Zhihong <zhihong.wang@intel.com>; Daly, Dan <dan.daly@intel.com>
>> Subject: Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based
>> accelerators
>>
>> On 22/12/17 17:41, Tiwei Bie wrote:
>>> This RFC patch set does some small extensions to vhost-user protocol
>>> to support VFIO based accelerators, and makes it possible to get the
>>> similar performance of VFIO passthru while keeping the virtio device
>>> emulation in QEMU.
>>>
>>> When we have virtio ring compatible devices, it's possible to setup
>>> the device (DMA mapping, PCI config, etc) based on the existing info
>>> (memory-table, features, vring info, etc) which is available on the
>>> vhost-backend (e.g. DPDK vhost library). Then, we will be able to use
>>> such devices to accelerate the emulated device for the VM. And we call
>>> it vDPA: vhost DataPath Acceleration. The key difference between VFIO
>>> passthru and vDPA is that, in vDPA only the data path (e.g. ring,
>>> notify and queue interrupt) is pass-throughed, the device control path
>>> (e.g. PCI configuration space and MMIO regions) is still defined and
>>> emulated by QEMU.
>>>
>>> The benefits of keeping virtio device emulation in QEMU compared with
>>> virtio device VFIO passthru include (but not limit to):
>>>
>>> - consistent device interface from guest OS;
>>> - max flexibility on control path and hardware design;
>>> - leveraging the existing virtio live-migration framework;
>>>
>>> But the critical issue in vDPA is that the data path performance is
>>> relatively low and some host threads are needed for the data path,
>>> because some necessary mechanisms are missing to support:
>>>
>>> 1) guest driver notifies the device directly;
>>> 2) device interrupts the guest directly;
>>>
>>> So this patch set does some small extensions to vhost-user protocol to
>>> make both of them possible. It leverages the same mechanisms (e.g.
>>> EPT and Posted-Interrupt on Intel platform) as the VFIO passthru to
>>> achieve the data path pass through.
>>>
>>> A new protocol feature bit is added to negotiate the accelerator
>>> feature support. Two new slave message types are added to enable the
>>> notify and interrupt passthru for each queue. From the view of
>>> vhost-user protocol design, it's very flexible. The passthru can be
>>> enabled/disabled for each queue individually, and it's possible to
>>> accelerate each queue by different devices. More design and
>>> implementation details can be found from the last patch.
>>>
>>> There are some rough edges in this patch set (so this is a RFC patch
>>> set for now), but it's never too early to hear the thoughts from the
>>> community! So any comments and suggestions would be really appreciated!
>>
>> I am missing a lot of context here. Out of curiosity - how is this all supposed to
>> work? QEMU command line example would be useful, what will the guest see? A
>> virtio device (i.e. Redhat vendor ID) or an actual PCI device (since VFIO is
>> mentioned)? Thanks.
> 
> It's a normal virtio PCIe devices in the guest. Extensions on the host are transparent to the guest.
> 
> In terms of the usage, there's a sample may help.
> http://dpdk.org/ml/archives/dev/2017-December/085044.html
> The sample takes virtio-net device in VM as data path accelerator of virtio-net in nested VM.


Aaah, this is for nested VMs, the original description was not clear about
this. I get it now, thanks.


> When taking physical device on bare metal, it accelerates virtio-net in VM equivalently.
> There's no additional params of QEMU command line needed for vhost-user.
> 
> One more context, including vDPA enabling in DPDK vhost-user library.
> http://dpdk.org/ml/archives/dev/2017-December/084792.html



-- 
Alexey

Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based accelerators

Posted by Liang, Cunming 8 years, 1 month ago


> -----Original Message-----
> From: Alexey Kardashevskiy [mailto:aik@ozlabs.ru]
> Sent: Tuesday, January 2, 2018 2:01 PM
> To: Liang, Cunming <cunming.liang@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> virtio-dev@lists.oasis-open.org; qemu-devel@nongnu.org; mst@redhat.com;
> alex.williamson@redhat.com; pbonzini@redhat.com; stefanha@redhat.com
> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Wang, Xiao W
> <xiao.w.wang@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>; Daly,
> Dan <dan.daly@intel.com>
> Subject: Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO based
> accelerators
> 
> On 02/01/18 16:49, Liang, Cunming wrote:
> >
> >
> >> -----Original Message-----
> >> From: Alexey Kardashevskiy [mailto:aik@ozlabs.ru]
> >> Sent: Tuesday, January 2, 2018 10:42 AM
> >> To: Bie, Tiwei <tiwei.bie@intel.com>;
> >> virtio-dev@lists.oasis-open.org; qemu-devel@nongnu.org;
> >> mst@redhat.com; alex.williamson@redhat.com; pbonzini@redhat.com;
> >> stefanha@redhat.com
> >> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Liang, Cunming
> >> <cunming.liang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>;
> >> Wang, Zhihong <zhihong.wang@intel.com>; Daly, Dan
> >> <dan.daly@intel.com>
> >> Subject: Re: [Qemu-devel] [RFC 0/3] Extend vhost-user to support VFIO
> >> based accelerators
> >>
> >> On 22/12/17 17:41, Tiwei Bie wrote:
> >>> This RFC patch set does some small extensions to vhost-user protocol
> >>> to support VFIO based accelerators, and makes it possible to get the
> >>> similar performance of VFIO passthru while keeping the virtio device
> >>> emulation in QEMU.
> >>>
> >>> When we have virtio ring compatible devices, it's possible to setup
> >>> the device (DMA mapping, PCI config, etc) based on the existing info
> >>> (memory-table, features, vring info, etc) which is available on the
> >>> vhost-backend (e.g. DPDK vhost library). Then, we will be able to
> >>> use such devices to accelerate the emulated device for the VM. And
> >>> we call it vDPA: vhost DataPath Acceleration. The key difference
> >>> between VFIO passthru and vDPA is that, in vDPA only the data path
> >>> (e.g. ring, notify and queue interrupt) is pass-throughed, the
> >>> device control path (e.g. PCI configuration space and MMIO regions)
> >>> is still defined and emulated by QEMU.
> >>>
> >>> The benefits of keeping virtio device emulation in QEMU compared
> >>> with virtio device VFIO passthru include (but not limit to):
> >>>
> >>> - consistent device interface from guest OS;
> >>> - max flexibility on control path and hardware design;
> >>> - leveraging the existing virtio live-migration framework;
> >>>
> >>> But the critical issue in vDPA is that the data path performance is
> >>> relatively low and some host threads are needed for the data path,
> >>> because some necessary mechanisms are missing to support:
> >>>
> >>> 1) guest driver notifies the device directly;
> >>> 2) device interrupts the guest directly;
> >>>
> >>> So this patch set does some small extensions to vhost-user protocol
> >>> to make both of them possible. It leverages the same mechanisms (e.g.
> >>> EPT and Posted-Interrupt on Intel platform) as the VFIO passthru to
> >>> achieve the data path pass through.
> >>>
> >>> A new protocol feature bit is added to negotiate the accelerator
> >>> feature support. Two new slave message types are added to enable the
> >>> notify and interrupt passthru for each queue. From the view of
> >>> vhost-user protocol design, it's very flexible. The passthru can be
> >>> enabled/disabled for each queue individually, and it's possible to
> >>> accelerate each queue by different devices. More design and
> >>> implementation details can be found from the last patch.
> >>>
> >>> There are some rough edges in this patch set (so this is a RFC patch
> >>> set for now), but it's never too early to hear the thoughts from the
> >>> community! So any comments and suggestions would be really appreciated!
> >>
> >> I am missing a lot of context here. Out of curiosity - how is this
> >> all supposed to work? QEMU command line example would be useful, what
> >> will the guest see? A virtio device (i.e. Redhat vendor ID) or an
> >> actual PCI device (since VFIO is mentioned)? Thanks.
> >
> > It's a normal virtio PCIe devices in the guest. Extensions on the host are transparent to the guest.
> >
> > In terms of the usage, there's a sample may help.
> > http://dpdk.org/ml/archives/dev/2017-December/085044.html
> > The sample takes virtio-net device in VM as data path accelerator of virtio-net in nested VM.
> 
> 
> Aaah, this is for nested VMs, the original description was not clear about this. I
> get it now, thanks.

BTW, the patch is not only for nested VMs, even though the sample is.
Once you have a virtio compatible device, it's helpful for normal VMs too.
Basically, it gives a para-virtualized device the extra ability to associate with an accelerator that can talk to the guest PV device driver directly.

> 
> 
> > When taking physical device on bare metal, it accelerates virtio-net in VM equivalently.
> > There's no additional params of QEMU command line needed for vhost-user.
> >
> > One more context, including vDPA enabling in DPDK vhost-user library.
> > http://dpdk.org/ml/archives/dev/2017-December/084792.html
> 
> 
> 
> --
> Alexey