[PATCH 0/6] VDUSE: Support registering userspace memory as bounce buffer

Xie Yongji posted 6 patches 3 years, 9 months ago
There is a newer version of this series
Posted by Xie Yongji 3 years, 9 months ago
Hi all,

This series introduces three new ioctls, VDUSE_IOTLB_GET_INFO,
VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM, to support
registering and de-registering userspace memory as the IOTLB
bounce buffer in the virtio-vdpa case.

The VDUSE_IOTLB_GET_INFO ioctl lets the user query IOTLB
information such as the bounce buffer size. The user can then
pass that information to the VDUSE_IOTLB_REG_UMEM and
VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
userspace memory for the IOTLB.
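
As a rough illustration of the query-then-register flow, the
sketch below prepares a userspace region sized from the queried
bounce buffer size. The struct names and fields (bounce_info,
umem_request) and both helpers are hypothetical stand-ins, not
the uapi definitions; the real layouts are in
include/uapi/linux/vduse.h in this series. It also assumes the
region must be page-aligned and cover the whole bounce buffer.
The ioctl calls themselves are left out because they need a
VDUSE device.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for what the query step reports. */
struct bounce_info {
    uint64_t size;   /* bounce buffer size in bytes */
};

/* Hypothetical stand-in for the registration argument. */
struct umem_request {
    uint64_t uaddr;  /* userspace virtual address of the region */
    uint64_t size;   /* length of the region in bytes */
};

/*
 * Allocate a page-aligned region large enough to back the whole
 * bounce buffer. Returns NULL if the size is zero, is not a
 * multiple of the page size, or the allocation fails.
 */
static void *alloc_bounce_region(const struct bounce_info *info)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || info->size == 0 ||
        info->size % (uint64_t)page != 0)
        return NULL;
    return aligned_alloc((size_t)page, (size_t)info->size);
}

/*
 * Fill the request that would be handed to the register (and
 * later the de-register) step. Rejects unaligned buffers.
 */
static bool build_umem_request(const struct bounce_info *info,
                               void *buf, struct umem_request *req)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || buf == NULL ||
        (uintptr_t)buf % (uintptr_t)page != 0)
        return false;
    req->uaddr = (uint64_t)(uintptr_t)buf;
    req->size = info->size;
    return true;
}
```

An application such as SPDK or DPDK would pass its preallocated
hugepages instead of calling alloc_bounce_region().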

During registration and de-registration, any DMA data in use
is copied from the kernel bounce pages to the userspace bounce
pages and back.
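
The copy step can be pictured with a simplified userspace model.
This is only a sketch and not the code in iova_domain.c: the
bounce_slot struct, the in_use flag, the fixed 4096-byte page
size and all function names are assumptions made for the
example.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size for the model */

/* One page-sized slot of the bounce buffer (hypothetical model). */
struct bounce_slot {
    unsigned char *kernel_page;  /* page owned by the kernel side */
    unsigned char *user_page;    /* registered user page, or NULL */
    int in_use;                  /* slot holds in-flight DMA data */
};

/* Page the data path should use for this slot right now. */
static unsigned char *active_page(struct bounce_slot *slot)
{
    return slot->user_page ? slot->user_page : slot->kernel_page;
}

/*
 * Registration: switch every slot to its user page, copying the
 * contents over first when the slot holds in-flight data.
 */
static void register_user_pages(struct bounce_slot *slots, size_t n,
                                unsigned char *umem)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char *dst = umem + i * SKETCH_PAGE_SIZE;

        if (slots[i].in_use)
            memcpy(dst, slots[i].kernel_page, SKETCH_PAGE_SIZE);
        slots[i].user_page = dst;
    }
}

/*
 * De-registration: copy in-flight data back to the kernel pages
 * and drop the references to the user pages.
 */
static void deregister_user_pages(struct bounce_slot *slots, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (slots[i].user_page == NULL)
            continue;
        if (slots[i].in_use)
            memcpy(slots[i].kernel_page, slots[i].user_page,
                   SKETCH_PAGE_SIZE);
        slots[i].user_page = NULL;
    }
}
```

In this model, only slots marked in_use are copied, so in-flight
requests survive the switch in either direction.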

With this feature, existing applications such as SPDK and
DPDK can use the VDUSE datapath directly and efficiently, as
discussed before [1]. They can register preallocated
hugepages with VDUSE to avoid an extra memcpy from the bounce
buffer to the hugepages.

The kernel and userspace code can be found on GitHub:

https://github.com/bytedance/linux/tree/vduse-umem
https://github.com/bytedance/qemu/tree/vduse-umem

To test it with qemu-storage-daemon:

$ qemu-storage-daemon \
    --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
    --monitor chardev=charmonitor \
    --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
    --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on

[1] https://lkml.org/lkml/2021/6/27/318

Please review, thanks!

Xie Yongji (6):
  vduse: Remove unnecessary spin lock protection
  vduse: Use memcpy_{to,from}_page() in do_bounce()
  vduse: Support using userspace pages as bounce buffer
  vduse: Support querying IOLTB information
  vduse: Support registering userspace memory for IOTLB
  vduse: Update api version to 1

 drivers/vdpa/vdpa_user/iova_domain.c | 134 +++++++++++++++++++---
 drivers/vdpa/vdpa_user/iova_domain.h |   9 ++
 drivers/vdpa/vdpa_user/vduse_dev.c   | 163 +++++++++++++++++++++++++++
 include/uapi/linux/vduse.h           |  53 ++++++++-
 4 files changed, 345 insertions(+), 14 deletions(-)

-- 
2.20.1
Re: [PATCH 0/6] VDUSE: Support registering userspace memory as bounce buffer
Posted by Liu Xiaodong 3 years, 9 months ago
On Wed, Jun 29, 2022 at 04:25:35PM +0800, Xie Yongji wrote:
> Hi all,
> 
> This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> registering and de-registering userspace memory for IOTLB
> as bounce buffer in virtio-vdpa case.
> 
> The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOTLB
> information such as bounce buffer size. Then user can use
> those information on VDUSE_IOTLB_REG_UMEM and
> VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> userspace memory for IOTLB.
> 
> During registering and de-registering, the DMA data in use
> would be copied from kernel bounce pages to userspace bounce
> pages and back.
> 
> With this feature, some existing application such as SPDK
> and DPDK can leverage the datapath of VDUSE directly and
> efficiently as discussed before [1]. They can register some
> preallocated hugepages to VDUSE to avoid an extra memcpy
> from bounce-buffer to hugepages.

Hi, Yongji

Very glad to see this enhancement in VDUSE. Thank you.
It is really helpful and essential to SPDK.
With this new feature, data transferred through VDUSE can be
accessed directly by userspace physical backends, such as RDMA
and PCIe devices.

On the SPDK roadmap, exporting block services to the local
host is an important item, especially for container scenarios.
This patch could help SPDK do that with its userspace
backend stacks while keeping high efficiency and performance,
so the whole SPDK ecosystem can benefit.

Based on this enhancement, as discussed, I drafted a VDUSE
prototype module in SPDK for initial evaluation:
[TEST]vduse: prototype for initial draft
https://review.spdk.io/gerrit/c/spdk/spdk/+/13534

I ran SPDK on a single CPU core, configured with 2 P3700 NVMe
devices, and exported block devices to the local host kernel
via different protocols. The randwrite IOPS through each
protocol are:
NBD               121K
NVMf-tcp loopback 274K
VDUSE             463K
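
To put the figures side by side, the small helper below
computes the ratio between two IOPS values. The function name
is made up for this note, and the inputs are just the numbers
listed above.

```c
/*
 * Ratio of two IOPS figures. Returns 0.0 when the baseline is
 * not positive, so callers never divide by zero.
 */
static double iops_ratio(double iops, double baseline_iops)
{
    if (baseline_iops <= 0.0)
        return 0.0;
    return iops / baseline_iops;
}
```

With the numbers above, iops_ratio(463.0, 121.0) is about 3.8
(VDUSE over NBD) and iops_ratio(463.0, 274.0) is about 1.7
(VDUSE over NVMf-tcp loopback).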

SPDK with RDMA backends should have a similar ratio.
VDUSE has a great performance advantage for SPDK.
We have been investigating this use case for years.
Originally, some SPDK users used NBD. Then NVMf-tcp loopback
became the SPDK community's recommended way. In the future,
VDUSE could be the preferred way.

> The kernel and userspace codes could be found in github:
> 
> https://github.com/bytedance/linux/tree/vduse-umem
> https://github.com/bytedance/qemu/tree/vduse-umem
> 
> To test it with qemu-storage-daemon:
> 
> $ qemu-storage-daemon \
>     --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
>     --monitor chardev=charmonitor \
>     --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
>     --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on
> 
> [1] https://lkml.org/lkml/2021/6/27/318
> 
> Please review, thanks!

Waiting for its review process.

Thanks
Xiaodong
Re: [PATCH 0/6] VDUSE: Support registering userspace memory as bounce buffer
Posted by Yongji Xie 3 years, 9 months ago
Hi Xiaodong,

On Mon, Jul 4, 2022 at 5:27 PM Liu Xiaodong <xiaodong.liu@intel.com> wrote:
>
> On Wed, Jun 29, 2022 at 04:25:35PM +0800, Xie Yongji wrote:
> > Hi all,
> >
> > This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> > VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> > registering and de-registering userspace memory for IOTLB
> > as bounce buffer in virtio-vdpa case.
> >
> > The VDUSE_IOTLB_GET_INFO ioctl can help user to query IOTLB
> > information such as bounce buffer size. Then user can use
> > those information on VDUSE_IOTLB_REG_UMEM and
> > VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> > userspace memory for IOTLB.
> >
> > During registering and de-registering, the DMA data in use
> > would be copied from kernel bounce pages to userspace bounce
> > pages and back.
> >
> > With this feature, some existing application such as SPDK
> > and DPDK can leverage the datapath of VDUSE directly and
> > efficiently as discussed before [1]. They can register some
> > preallocated hugepages to VDUSE to avoid an extra memcpy
> > from bounce-buffer to hugepages.
>
> Hi, Yongji
>
> Very glad to see this enhancement in VDUSE. Thank you.
> It is really helpful and essential to SPDK.
> With this new feature, we can get VDUSE transferred data
> accessed directly by userspace physical backends, like RDMA
> and PCIe devices.
>
> In SPDK roadmap, it's one important work to export block
> services to local host, especially for container scenario.
> This patch could help SPDK do that with its userspace
> backend stacks while keeping high efficiency and performance.
> So the whole SPDK ecosystem can get benefited.
>
> Based on this enhancement, as discussed, I drafted a VDUSE
> prototype module in SPDK for initial evaluation:
> [TEST]vduse: prototype for initial draft
> https://review.spdk.io/gerrit/c/spdk/spdk/+/13534
>

Thanks for this nice work!

> Running SPDK on single CPU core, configured with 2 P3700 NVMe,
> and exported block devices to local host kernel via different
> protocols. The randwrite IOPS through each protocol are:
> NBD               121K
> NVMf-tcp loopback 274K
> VDUSE             463K
>
> SPDK with RDMA backends should have a similar ratio.
> VDUSE has a great performance advantage for SPDK.
> We have kept investigating on this usage for years.
> Originally, some SPDK users used NBD. Then NVMf-tcp loopback
> is SPDK community recommended way. In future, VDUSE could be
> the preferred way.
>

Glad to see SPDK can benefit from this feature. I will continue to
improve this feature to make it available ASAP.

Thanks,
Yongji

> > The kernel and userspace codes could be found in github:
> >
> > https://github.com/bytedance/linux/tree/vduse-umem
> > https://github.com/bytedance/qemu/tree/vduse-umem
> >
> > To test it with qemu-storage-daemon:
> >
> > $ qemu-storage-daemon \
> >     --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
> >     --monitor chardev=charmonitor \
> >     --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> >     --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on
> >
> > [1] https://lkml.org/lkml/2021/6/27/318
> >
> > Please review, thanks!
>
> Waiting for its review process.
>
> Thanks
> Xiaodong