[Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines

Dr. David Alan Gilbert (git) posted 7 patches 5 years, 4 months ago
Test checkpatch passed
Test docker-quick@centos7 passed
Test docker-clang@ubuntu passed
Test docker-mingw@fedora failed
Test asan passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20181210173151.16629-1-dgilbert@redhat.com
[Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Dr. David Alan Gilbert (git) 5 years, 4 months ago
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Hi,
  This is the first RFC for the QEMU side of 'virtio-fs',
a new mechanism for mounting host directories into the guest
in a fast, consistent and secure manner.  Our primary use
case is kata containers, but it should be usable in other scenarios
as well.

There are corresponding patches being posted to the Linux kernel,
libfuse and kata lists.

For a fuller design description, and benchmark numbers, please see
Vivek's posting of the kernel set here:

https://marc.info/?l=linux-kernel&m=154446243024251&w=2

We've got a small website with instructions on how to use it, here:

https://virtio-fs.gitlab.io/

and all the code is available on gitlab at:

https://gitlab.com/virtio-fs

QEMU's changes
--------------

The QEMU changes are pretty small.

There's a new vhost-user device, which is used to carry a stream of
FUSE messages to an external daemon that actually performs
all the file IO.  The FUSE daemon is an external process in order to
achieve better isolation for security and resource control (e.g. number
of file descriptors) and also because it's cleaner than trying to
integrate libfuse into QEMU.
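
To make "a stream of FUSE messages" a little more concrete, here is a
minimal sketch of how one request could be framed before it is handed to
the daemon.  The header layout is modelled on struct fuse_in_header from
Linux's <linux/fuse.h> as I understand it; every name prefixed sketch_ is
hypothetical, and how the bytes are placed on a virtqueue is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the FUSE request header layout (an assumption modelled on
 * struct fuse_in_header; not taken from this patch series). */
struct sketch_fuse_in_header {
    uint32_t len;      /* total request length, header included */
    uint32_t opcode;   /* operation code, 1 is FUSE_LOOKUP */
    uint64_t unique;   /* request id, echoed back in the reply */
    uint64_t nodeid;   /* inode the operation applies to */
    uint32_t uid;
    uint32_t gid;
    uint32_t pid;
    uint32_t padding;
};

static_assert(sizeof(struct sketch_fuse_in_header) == 40,
              "header must be 40 bytes with no implicit padding");

#define SKETCH_FUSE_LOOKUP 1u

/* Build a LOOKUP request for 'name' under directory 'parent' into 'buf'.
 * Returns the number of bytes written, or 0 if 'buf' is too small. */
size_t sketch_build_lookup(void *buf, size_t buf_len, uint64_t unique,
                           uint64_t parent, const char *name)
{
    size_t name_len = strlen(name) + 1;   /* payload includes the NUL */
    size_t total = sizeof(struct sketch_fuse_in_header) + name_len;
    struct sketch_fuse_in_header hdr;

    if (total > buf_len || total > UINT32_MAX) {
        return 0;
    }
    memset(&hdr, 0, sizeof(hdr));
    hdr.len = (uint32_t)total;
    hdr.opcode = SKETCH_FUSE_LOOKUP;
    hdr.unique = unique;
    hdr.nodeid = parent;
    memcpy(buf, &hdr, sizeof(hdr));
    memcpy((char *)buf + sizeof(hdr), name, name_len);
    return total;
}
```

Calling sketch_build_lookup(buf, sizeof(buf), 7, 1, "hello") fills in a
40 byte header followed by the 6 byte name; the daemon would answer with
a reply carrying the same 'unique' value.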

This device has an extra BAR that contains (up to) 3 regions:

 a) a DAX mapping range ('the cache') - into which QEMU mmaps
    files on behalf of the external daemon; those files are
    then directly mapped by the guest in a way similar to a
    DAX-backed file system.  One advantage of this is that multiple
    guests accessing the same files should all share
    those pages of host cache.

 b) An experimental set of mappings for use by a metadata versioning
    daemon;  this mapping is shared between multiple guests and
    the daemon, but only contains a set of version counters that
    allow a guest to quickly tell if its metadata is stale.
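
The cache region in (a) can be pictured as a fixed window of address
space into which file ranges are mapped on request.  The sketch below
shows that idea with plain POSIX calls: reserve a window, then place a
shared file mapping at an offset inside it with MAP_FIXED.  It is only an
illustration under my own assumptions; the sketch_ names are hypothetical
and this is not the code from the series.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Reserve 'size' bytes of inaccessible address space; this stands in for
 * the cache region of the BAR.  Returns NULL on failure. */
void *sketch_cache_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Map 'len' bytes of 'fd' starting at 'file_off' over the window at
 * 'win_off'.  Both offsets must be page aligned.  Returns 0 on success. */
int sketch_cache_map(void *window, size_t win_size, size_t win_off,
                     int fd, off_t file_off, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0 || win_off % (size_t)page || file_off % page) {
        return -1;
    }
    if (len == 0 || win_off > win_size || len > win_size - win_off) {
        return -1;
    }
    p = mmap((char *)window + win_off, len, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, fd, file_off);
    return p == MAP_FAILED ? -1 : 0;
}

/* Demo: back the second page of a four page window with a temporary file
 * and read the file's contents through the window.  Returns 0 on success. */
int sketch_cache_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();
    void *window = NULL;
    int ok = 0;

    if (f && page > 0 &&
        ftruncate(fileno(f), page) == 0 &&
        pwrite(fileno(f), "virtio-fs", 9, 0) == 9) {
        window = sketch_cache_reserve(4 * (size_t)page);
    }
    if (window) {
        ok = sketch_cache_map(window, 4 * (size_t)page, (size_t)page,
                              fileno(f), 0, (size_t)page) == 0 &&
             memcmp((char *)window + page, "virtio-fs", 9) == 0;
        munmap(window, 4 * (size_t)page);
    }
    if (f) {
        fclose(f);
    }
    return ok ? 0 : -1;
}
```

Because the mapping is MAP_SHARED, two processes mapping the same file
see the same page cache pages, which is the sharing effect described for
multiple guests above.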

TODO
----

This is the first RFC; we know we have a bunch of things to clear up:

  a) The virtio device specification is still in flux and is expected
     to change.

  b) We'd like to find ways of reducing the map/unmap latency for DAX.

  c) The metadata versioning scheme needs to settle out.

  d) mmap'ing host files has some interesting side effects; for example
     if the file gets truncated by the host and then the guest accesses
     the mapping, KVM can fail the guest hard.
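
The hazard in (d) has a familiar host-side analogue: touching a page of a
shared file mapping that lies wholly beyond the end of the file raises
SIGBUS.  The sketch below only shows how one might check whether a range
is still backed by the file; it is racy by nature (the file can shrink
right after the check), so treat it as an illustration of the problem
rather than a fix.  The sketch_ names are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if bytes [off, off + len) currently lie within the file behind
 * 'fd', 0 if they do not, and -1 on error. */
int sketch_range_backed(int fd, off_t off, size_t len)
{
    struct stat st;

    if (off < 0 || fstat(fd, &st) != 0) {
        return -1;
    }
    if (off > st.st_size) {
        return 0;
    }
    return (uintmax_t)len <= (uintmax_t)(st.st_size - off) ? 1 : 0;
}

/* Demo: a range that is backed before a truncate is no longer backed
 * afterwards.  Returns 0 if that is what was observed. */
int sketch_truncate_demo(void)
{
    FILE *f = tmpfile();
    int before, after;

    if (!f) {
        return -1;
    }
    if (ftruncate(fileno(f), 8192) != 0) {
        fclose(f);
        return -1;
    }
    before = sketch_range_backed(fileno(f), 4096, 4096);
    if (ftruncate(fileno(f), 0) != 0) {
        fclose(f);
        return -1;
    }
    after = sketch_range_backed(fileno(f), 4096, 4096);
    fclose(f);
    return (before == 1 && after == 0) ? 0 : -1;
}
```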

Dr. David Alan Gilbert (6):
  virtio: Add shared memory capability
  virtio-fs: Add cache BAR
  virtio-fs: Add vhost-user slave commands for mapping
  virtio-fs: Fill in slave commands for mapping
  virtio-fs: Allow mapping of meta data version table
  virtio-fs: Allow mapping of journal

Stefan Hajnoczi (1):
  virtio: add vhost-user-fs-pci device

 configure                                   |  10 +
 contrib/libvhost-user/libvhost-user.h       |   3 +
 docs/interop/vhost-user.txt                 |  35 ++
 hw/virtio/Makefile.objs                     |   1 +
 hw/virtio/vhost-user-fs.c                   | 517 ++++++++++++++++++++
 hw/virtio/vhost-user.c                      |  16 +
 hw/virtio/virtio-pci.c                      | 115 +++++
 hw/virtio/virtio-pci.h                      |  19 +
 include/hw/pci/pci.h                        |   1 +
 include/hw/virtio/vhost-user-fs.h           |  79 +++
 include/standard-headers/linux/virtio_fs.h  |  48 ++
 include/standard-headers/linux/virtio_ids.h |   1 +
 include/standard-headers/linux/virtio_pci.h |   9 +
 13 files changed, 854 insertions(+)
 create mode 100644 hw/virtio/vhost-user-fs.c
 create mode 100644 include/hw/virtio/vhost-user-fs.h
 create mode 100644 include/standard-headers/linux/virtio_fs.h

-- 
2.19.2


Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by no-reply@patchew.org 5 years, 4 months ago
Patchew URL: https://patchew.org/QEMU/20181210173151.16629-1-dgilbert@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-mingw@fedora SHOW_ENV=1 J=8
=== TEST SCRIPT END ===

  CC      ui/gtk.o
  CC      chardev/char.o
  CC      chardev/char-console.o
/tmp/qemu-test/src/hw/virtio/virtio-pci.c:1167:12: error: 'virtio_pci_add_shm_cap' defined but not used [-Werror=unused-function]
 static int virtio_pci_add_shm_cap(VirtIOPCIProxy *proxy,
            ^~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
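
For readers unfamiliar with this class of failure: -Wunused-function
fires when a static function has no caller in the configuration being
built, and -Werror turns that into a build break.  Presumably the only
caller of the helper is compiled out on this target, though the log does
not say so.  A minimal sketch of one common remedy, guarding the helper
with the same condition as its caller, using hypothetical names:

```c
#include <assert.h>

/* Hypothetical stand-in for a build-time switch that enables the only
 * caller of the helper below. */
#define SKETCH_HAVE_FS_DEVICE 1

#if SKETCH_HAVE_FS_DEVICE
/* Compiled only when its caller is compiled too, so the compiler never
 * sees a static function without a user. */
static int sketch_add_shm_cap(int bar, int id)
{
    /* PCI devices have six BARs, numbered 0 to 5. */
    return (bar >= 0 && bar < 6 && id >= 0) ? 0 : -1;
}

int sketch_fs_realize(void)
{
    return sketch_add_shm_cap(2, 0);
}
#endif
```

An alternative is to keep the helper unconditional and mark it with the
compiler's unused attribute, at the cost of hiding genuinely dead code.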


The full log is available at
http://patchew.org/logs/20181210173151.16629-1-dgilbert@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by jiangyiwen 5 years, 3 months ago

Hi Dave,

I encountered a problem after running QEMU with virtio-fs.

I find I can only mount virtio-fs using one of the following commands:
mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0
or mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0,dax
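
For reference, the same mount expressed through mount(2) might look like
the sketch below.  The option string and the "virtio_fs" type name are
copied from the commands above; the sketch_ function names are
hypothetical, and the actual call needs root and a guest kernel carrying
the RFC driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Format the mount options from the commands above into 'out'.
 * Returns 0 on success, -1 if 'out' is too small. */
int sketch_fs_options(char *out, size_t out_len, const char *tag, int dax)
{
    int n = snprintf(out, out_len,
                     "tag=%s,rootmode=040000,user_id=0,group_id=0%s",
                     tag, dax ? ",dax" : "");
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Mount the share identified by 'tag' at 'target'.  Returns the result
 * of mount(2), or -1 if the option string does not fit. */
int sketch_fs_mount(const char *target, const char *tag, int dax)
{
    char opts[256];

    if (sketch_fs_options(opts, sizeof(opts), tag, dax) != 0) {
        return -1;
    }
    return mount("/dev/null", target, "virtio_fs", 0, opts);
}
```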

So how can I use "cache=always" or "cache=none", or even "cache=auto" or "cache=writeback"?

Thanks,
Yiwen.


Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Vivek Goyal 5 years, 3 months ago
On Sat, Dec 22, 2018 at 05:27:28PM +0800, jiangyiwen wrote:
> Hi Dave,
> 
> I encounter a problem after running qemu with virtio-fs,
> 
> I find I only can mount virtio-fs using the following command:
> mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0
> or mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0,dax
> 
> Then, I want to know how to use "cache=always" or "cache=none", even "cache=auto", "cache=writeback"?
> 
> Thanks,
> Yiwen.

Hi Yiwen,

As of now, cache options are libfuse daemon options. So when starting the
daemon, specify "-o cache=none" or "-o cache=always" etc. One cannot
specify the caching option at virtio-fs mount time.
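
As a rough illustration of what "a daemon option" means here, a daemon
could recognise the value along the lines below.  This is a hypothetical
sketch, not the libfuse daemon's actual parser: it accepts "none" and
"always" as named above plus "auto" from the question, and makes no claim
about which other values the real daemon supports.

```c
#include <assert.h>
#include <string.h>

enum sketch_cache_mode {
    SKETCH_CACHE_INVALID = -1,
    SKETCH_CACHE_NONE,
    SKETCH_CACHE_AUTO,
    SKETCH_CACHE_ALWAYS
};

/* Parse one "-o" argument of the form "cache=<mode>".  Anything that is
 * not a recognised cache option yields SKETCH_CACHE_INVALID. */
enum sketch_cache_mode sketch_parse_cache(const char *opt)
{
    static const char prefix[] = "cache=";
    const char *val;

    if (strncmp(opt, prefix, sizeof(prefix) - 1) != 0) {
        return SKETCH_CACHE_INVALID;
    }
    val = opt + sizeof(prefix) - 1;
    if (strcmp(val, "none") == 0) {
        return SKETCH_CACHE_NONE;
    }
    if (strcmp(val, "auto") == 0) {
        return SKETCH_CACHE_AUTO;
    }
    if (strcmp(val, "always") == 0) {
        return SKETCH_CACHE_ALWAYS;
    }
    return SKETCH_CACHE_INVALID;
}
```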

Thanks
Vivek

Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by jiangyiwen 5 years, 3 months ago
On 2018/12/27 3:08, Vivek Goyal wrote:
> On Sat, Dec 22, 2018 at 05:27:28PM +0800, jiangyiwen wrote:
>> Hi Dave,
>>
>> I encounter a problem after running qemu with virtio-fs,
>>
>> I find I only can mount virtio-fs using the following command:
>> mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0
>> or mount -t virtio_fs /dev/null /mnt/virtio_fs/ -o tag=myfs,rootmode=040000,user_id=0,group_id=0,dax
>>
>> Then, I want to know how to use "cache=always" or "cache=none", even "cache=auto", "cache=writeback"?
>>
>> Thanks,
>> Yiwen.
> 
> Hi Yiwen,
> 
> As of now, cache options are libfuse daemon options. So while starting
> daemon, specify "-o cache=none" or "-o cache=always" etc. One can not
> specify caching option at virtio-fs mount time.
> 
> Thanks
> Vivek
> 

Ok, I get it, thanks.

Yiwen.


Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Greg Kurz 5 years ago
On Mon, 10 Dec 2018 17:31:44 +0000
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Hi,
>   This is the first RFC for the QEMU side of 'virtio-fs';
> a new mechanism for mounting host directories into the guest
> in a fast, consistent and secure manner.  Our primary use
> case is kata containers, but it should be usable in other scenarios
> as well.
> 
> There are corresponding patches being posted to Linux kernel,
> libfuse and kata lists.
> 
> For a fuller design description, and benchmark numbers, please see
> Vivek's posting of the kernel set here:
> 
> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> 
> We've got a small website with instructions on how to use it, here:
> 
> https://virtio-fs.gitlab.io/
> 
> and all the code is available on gitlab at:
> 
> https://gitlab.com/virtio-fs
> 

Hi !

This looks like a very promising replacement for virtio-9p, at
least with better chances of reaching a production quality level.

Not sure I'll have enough time to step in, but please Cc me on
future posts. As virtio-9p maintainer, I'll be happy to help if
I can. Also I'll be happy to get rid of the fsdev proxy backend
at some point (which I already wanted to replace with a vhost
user based solution :-) ).

Cheers,

--
Greg



Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Dr. David Alan Gilbert 5 years ago
* Greg Kurz (groug@kaod.org) wrote:
> On Mon, 10 Dec 2018 17:31:44 +0000
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Hi,
> >   This is the first RFC for the QEMU side of 'virtio-fs';
> > a new mechanism for mounting host directories into the guest
> > in a fast, consistent and secure manner.  Our primary use
> > case is kata containers, but it should be usable in other scenarios
> > as well.
> > 
> > There are corresponding patches being posted to Linux kernel,
> > libfuse and kata lists.
> > 
> > For a fuller design description, and benchmark numbers, please see
> > Vivek's posting of the kernel set here:
> > 
> > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > 
> > We've got a small website with instructions on how to use it, here:
> > 
> > https://virtio-fs.gitlab.io/
> > 
> > and all the code is available on gitlab at:
> > 
> > https://gitlab.com/virtio-fs
> > 
> 
> Hi !
> 
> This looks like a very promising replacement for virtio-9p, at
> least with better chances of reaching a production quality level.
> 
> Not sure I'll have enough time to step in, but please Cc me on
> future posts. As virtio-9p maintainer, I'll be happy to help if
> I can. Also I'll be happy to get rid of the fsdev proxy backend
> at some point (which I already wanted to replace with a vhost
> user based solution :-) ).

Thanks! We'll try and remember to keep you in the loop.
If there are any gotchas that you tripped over in 9p that we should
watch out for then please give us a prod.

Dave


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Daniel P. Berrangé 5 years, 4 months ago
On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Hi,
>   This is the first RFC for the QEMU side of 'virtio-fs';
> a new mechanism for mounting host directories into the guest
> in a fast, consistent and secure manner.  Our primary use
> case is kata containers, but it should be usable in other scenarios
> as well.
> 
> There are corresponding patches being posted to Linux kernel,
> libfuse and kata lists.
> 
> For a fuller design description, and benchmark numbers, please see
> Vivek's posting of the kernel set here:
> 
> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> 
> We've got a small website with instructions on how to use it, here:
> 
> https://virtio-fs.gitlab.io/
> 
> and all the code is available on gitlab at:
> 
> https://gitlab.com/virtio-fs
> 
> QEMU's changes
> --------------
> 
> The QEMU changes are pretty small; 
> 
> There's a new vhost-user device, which is used to carry a stream of
> FUSE messages to an external daemon that actually performs
> all the file IO.  The FUSE daemon is an external process in order to
> achieve better isolation for security and resource control (e.g. number
> of file descriptors) and also because it's cleaner than trying to
> integrate libfuse into QEMU.

Overall I like the virtio-fs architecture more than the virtio-vsock+NFS
approach, as virtio-fs feels simpler and closer to virtio-9p with the
latter's proxy backends.

I never really liked the idea of having to mess around with the host
NFS server to expose filesystems to guests, as that's a systemwide
service.  The ability to have an isolated virtio-fs backend process
per filesystem share per guest is simpler from a mgmt pov.

One thing I would like to see though is a general purpose, production
quality backend impl that is shipped by the QEMU project.  It is fine
if projects like Kata want to write a custom impl tailored to their
specific needs, but I think QEMU should have something as standard that
isn't just demoware.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Dr. David Alan Gilbert 5 years, 4 months ago
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Hi,
> >   This is the first RFC for the QEMU side of 'virtio-fs';
> > a new mechanism for mounting host directories into the guest
> > in a fast, consistent and secure manner.  Our primary use
> > case is kata containers, but it should be usable in other scenarios
> > as well.
> > 
> > There are corresponding patches being posted to Linux kernel,
> > libfuse and kata lists.
> > 
> > For a fuller design description, and benchmark numbers, please see
> > Vivek's posting of the kernel set here:
> > 
> > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > 
> > We've got a small website with instructions on how to use it, here:
> > 
> > https://virtio-fs.gitlab.io/
> > 
> > and all the code is available on gitlab at:
> > 
> > https://gitlab.com/virtio-fs
> > 
> > QEMU's changes
> > --------------
> > 
> > The QEMU changes are pretty small; 
> > 
> > There's a new vhost-user device, which is used to carry a stream of
> > FUSE messages to an external daemon that actually performs
> > all the file IO.  The FUSE daemon is an external process in order to
> > achieve better isolation for security and resource control (e.g. number
> > of file descriptors) and also because it's cleaner than trying to
> > integrate libfuse into QEMU.
> 
> Overall I like the virtio-fs architecture more than the virtio-vsock+NFS
> approach, as virtio-fs feels simpler and closer to virtio-9p with the
> latter's proxy backends.
> 
> I never really liked the idea of having to mess around with the host
> NFS server to expose filesystems to guests, as that's a systemwide
> service.  The ability to have an isolated virtio-fs backend process
> per filesystem share per guest is simpler from a mgmt pov.
> 
> One thing I would like to see though is a general purpose, production
> quality backend impl that is shipped by the QEMU project.  It is fine
> if projects like Kata want to write a custom impl tailored to their
> specific needs, but I think QEMU should have something as standard that
> isn't just demoware. 

Our patches sent to libfuse may provide that - after we tidy them up a
bit more; but it is the result of adding the fuse example code to qemu's
contrib vhost-user example code.    Given that this is the intersection
of so many projects I'm not sure I care which project distributes a
working implementation.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Daniel P. Berrangé 5 years, 4 months ago
On Wed, Dec 12, 2018 at 01:52:03PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > 
> > > Hi,
> > >   This is the first RFC for the QEMU side of 'virtio-fs';
> > > a new mechanism for mounting host directories into the guest
> > > in a fast, consistent and secure manner.  Our primary use
> > > case is kata containers, but it should be usable in other scenarios
> > > as well.
> > > 
> > > There are corresponding patches being posted to Linux kernel,
> > > libfuse and kata lists.
> > > 
> > > For a fuller design description, and benchmark numbers, please see
> > > Vivek's posting of the kernel set here:
> > > 
> > > https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> > > 
> > > We've got a small website with instructions on how to use it, here:
> > > 
> > > https://virtio-fs.gitlab.io/
> > > 
> > > and all the code is available on gitlab at:
> > > 
> > > https://gitlab.com/virtio-fs
> > > 
> > > QEMU's changes
> > > --------------
> > > 
> > > The QEMU changes are pretty small; 
> > > 
> > > There's a new vhost-user device, which is used to carry a stream of
> > > FUSE messages to an external daemon that actually performs
> > > all the file IO.  The FUSE daemon is an external process in order to
> > > achieve better isolation for security and resource control (e.g. number
> > > of file descriptors) and also because it's cleaner than trying to
> > > integrate libfuse into QEMU.
> > 
> > Overall I like the virtio-fs architecture more than the virtio-vsock+NFS
> > approach, as virtio-fs feels simpler and closer to virtio-9p with the
> > latter's proxy backends.
> > 
> > I never really liked the idea of having to mess around with the host
> > NFS server to expose filesystems to guests, as that's a systemwide
> > service.  The ability to have an isolated virtio-fs backend process
> > per filesystem share per guest is simpler from a mgmt pov.
> > 
> > One thing I would like to see though is a general purpose, production
> > quality backend impl that is shipped by the QEMU project.  It is fine
> > if projects like Kata want to write a custom impl tailored to their
> > specific needs, but I think QEMU should have something as standard that
> > isn't just demoware. 
> 
> Our patches sent to libfuse may provide that - after we tidy them up a
> bit more; but it is the result of adding the fuse example code to qemu's
> contrib vhost-user example code.    Given that this is the intersection
> of so many projects I'm not sure I care which project distributes a
> working implementation.

Right, but that's my point - the stuff in QEMU's contrib/ directories is
just demoware - not something we actually support as QEMU maintainers,
nor expect users to run in production. Likewise for stuff in libfuse
example/ directory AFAIK.

IMHO we need something whose support status is on a par with what you'd
get if we had the impl in-process for the main QEMU system emulator.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Stefan Hajnoczi 5 years, 4 months ago
On Wed, Dec 12, 2018 at 01:58:25PM +0000, Daniel P. Berrangé wrote:
> On Wed, Dec 12, 2018 at 01:52:03PM +0000, Dr. David Alan Gilbert wrote:
> IMHO we need something whose support status is on a par with what you'd
> get if we had the impl in-process for the main QEMU system emulator.

I agree.  Now that virtio-fs has been released we're working on todo
items that will make the libfuse code production-quality, including
security auditing and jailing of the process.

Once we're confident that this is a production-quality file server it's
a matter of moving it out of example/ or contrib/.  We might find that
the scope of a production-quality file server exceeds libfuse's example/
anyway and need to move it to a new home.

Stefan

Re: [Qemu-devel] [RFC PATCH 0/7] virtio-fs: shared file system for virtual machines
Posted by Stefan Hajnoczi 5 years, 4 months ago
On Mon, Dec 10, 2018 at 05:31:44PM +0000, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Hi,
>   This is the first RFC for the QEMU side of 'virtio-fs';
> a new mechanism for mounting host directories into the guest
> in a fast, consistent and secure manner.  Our primary use
> case is kata containers, but it should be usable in other scenarios
> as well.
> 
> There are corresponding patches being posted to Linux kernel,
> libfuse and kata lists.
> 
> For a fuller design description, and benchmark numbers, please see
> Vivek's posting of the kernel set here:
> 
> https://marc.info/?l=linux-kernel&m=154446243024251&w=2
> 
> We've got a small website with instructions on how to use it, here:
> 
> https://virtio-fs.gitlab.io/
> 
> and all the code is available on gitlab at:
> 
> https://gitlab.com/virtio-fs

A draft specification for the virtio-fs device is available here:

https://stefanha.github.io/virtio/virtio-fs.html#x1-38800010 (HTML)

https://github.com/stefanha/virtio/commit/e1cac3777ef03bc9c5c8ee91bcc6ba478272e6b6

Stefan