[PATCH 0/2] RFC: add -mem-shared option

Marc-André Lureau posted 2 patches 4 years, 3 months ago
Test asan passed
Test checkpatch failed
Test FreeBSD passed
Test docker-mingw@fedora passed
Test docker-clang@ubuntu passed
Test docker-quick@centos7 failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20191128141518.628245-1-marcandre.lureau@redhat.com
Maintainers: Richard Henderson <rth@twiddle.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Eduardo Habkost <ehabkost@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
exec.c                  | 11 ++++++++++-
hw/core/numa.c          | 16 +++++++++++++++-
include/qemu/memfd.h    |  3 +++
include/sysemu/sysemu.h |  1 +
qemu-options.hx         | 10 ++++++++++
util/memfd.c            | 39 +++++++++++++++++++++++++--------------
vl.c                    |  4 ++++
7 files changed, 68 insertions(+), 16 deletions(-)
[PATCH 0/2] RFC: add -mem-shared option
Posted by Marc-André Lureau 4 years, 3 months ago
Hi,

Setting up shared memory for vhost-user is a bit complicated from
command line, as it requires NUMA setup such as: m 4G -object
memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
node,memdev=mem.

Instead, I suggest to add a -mem-shared option for non-numa setups,
that will make the -mem-path or anonymouse memory shareable.

Comments welcome,

Marc-André Lureau (2):
  memfd: add qemu_memfd_open()
  Add -mem-shared option

 exec.c                  | 11 ++++++++++-
 hw/core/numa.c          | 16 +++++++++++++++-
 include/qemu/memfd.h    |  3 +++
 include/sysemu/sysemu.h |  1 +
 qemu-options.hx         | 10 ++++++++++
 util/memfd.c            | 39 +++++++++++++++++++++++++--------------
 vl.c                    |  4 ++++
 7 files changed, 68 insertions(+), 16 deletions(-)

-- 
2.24.0


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Dr. David Alan Gilbert 4 years, 3 months ago
* Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> Hi,
> 
> Setting up shared memory for vhost-user is a bit complicated from
> command line, as it requires NUMA setup such as: m 4G -object
> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> node,memdev=mem.
> 
> Instead, I suggest to add a -mem-shared option for non-numa setups,
> that will make the -mem-path or anonymouse memory shareable.
> 
> Comments welcome,

It's worth checking with Igor (cc'd) - he said he was going to work on
something similar.

One other thing this fixes is that it lets you potentially do vhost-user
on s390, since it currently has no NUMA.

Dave

> Marc-André Lureau (2):
>   memfd: add qemu_memfd_open()
>   Add -mem-shared option
> 
>  exec.c                  | 11 ++++++++++-
>  hw/core/numa.c          | 16 +++++++++++++++-
>  include/qemu/memfd.h    |  3 +++
>  include/sysemu/sysemu.h |  1 +
>  qemu-options.hx         | 10 ++++++++++
>  util/memfd.c            | 39 +++++++++++++++++++++++++--------------
>  vl.c                    |  4 ++++
>  7 files changed, 68 insertions(+), 16 deletions(-)
> 
> -- 
> 2.24.0
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Igor Mammedov 4 years, 3 months ago
On Thu, 28 Nov 2019 16:59:33 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> > Hi,
> > 
> > Setting up shared memory for vhost-user is a bit complicated from
> > command line, as it requires NUMA setup such as: m 4G -object
> > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > node,memdev=mem.
> > 
> > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > that will make the -mem-path or anonymouse memory shareable.
> > 
> > Comments welcome,  
> 
> It's worth checking with Igor (cc'd) - he said he was going to work on
> something similar.
> 
> One other thing this fixes is that it lets you potentially do vhost-user
> on s390, since it currently has no NUMA.
Switching to memdev will let vhost-user on s390 work as well.
This is convenience option and workarounds inability to set main RAM
properties in current impl. 


> Dave
> 
> > Marc-André Lureau (2):
> >   memfd: add qemu_memfd_open()
> >   Add -mem-shared option
> > 
> >  exec.c                  | 11 ++++++++++-
> >  hw/core/numa.c          | 16 +++++++++++++++-
> >  include/qemu/memfd.h    |  3 +++
> >  include/sysemu/sysemu.h |  1 +
> >  qemu-options.hx         | 10 ++++++++++
> >  util/memfd.c            | 39 +++++++++++++++++++++++++--------------
> >  vl.c                    |  4 ++++
> >  7 files changed, 68 insertions(+), 16 deletions(-)
> > 
> > -- 
> > 2.24.0
> > 
> >   
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Stefan Hajnoczi 4 years, 3 months ago
On Fri, Nov 29, 2019 at 10:23:25AM +0100, Igor Mammedov wrote:
> On Thu, 28 Nov 2019 16:59:33 +0000
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> > > Hi,
> > > 
> > > Setting up shared memory for vhost-user is a bit complicated from
> > > command line, as it requires NUMA setup such as: m 4G -object
> > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > node,memdev=mem.
> > > 
> > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > that will make the -mem-path or anonymouse memory shareable.
> > > 
> > > Comments welcome,  
> > 
> > It's worth checking with Igor (cc'd) - he said he was going to work on
> > something similar.
> > 
> > One other thing this fixes is that it lets you potentially do vhost-user
> > on s390, since it currently has no NUMA.
> Switching to memdev will let vhost-user on s390 work as well.
> This is convenience option and workarounds inability to set main RAM
> properties in current impl. 

Gong Su asked about virtio-fs (vhost-user) on s390.  This patch series
might be the first step to enabling it.

Stefan
Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Igor Mammedov 4 years, 3 months ago
On Fri, 13 Dec 2019 11:39:57 +0000
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> On Fri, Nov 29, 2019 at 10:23:25AM +0100, Igor Mammedov wrote:
> > On Thu, 28 Nov 2019 16:59:33 +0000
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> > > * Marc-André Lureau (marcandre.lureau@redhat.com) wrote:  
> > > > Hi,
> > > > 
> > > > Setting up shared memory for vhost-user is a bit complicated from
> > > > command line, as it requires NUMA setup such as: m 4G -object
> > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > node,memdev=mem.
> > > > 
> > > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > > that will make the -mem-path or anonymouse memory shareable.
> > > > 
> > > > Comments welcome,    
> > > 
> > > It's worth checking with Igor (cc'd) - he said he was going to work on
> > > something similar.
> > > 
> > > One other thing this fixes is that it lets you potentially do vhost-user
> > > on s390, since it currently has no NUMA.  
> > Switching to memdev will let vhost-user on s390 work as well.
> > This is convenience option and workarounds inability to set main RAM
> > properties in current impl.   
> 
> Gong Su asked about virtio-fs (vhost-user) on s390.  This patch series
> might be the first step to enabling it.

I'm preparing(resplitting/cleaning up) series that will switch main RAM
to memdev backend.

(
I'd prefer to post complete series that does conversion across all boards.
But if it's pressing, I surely can post several patches to enable it for s390
and get some early feedback on approach
)


> 
> Stefan


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by no-reply@patchew.org 4 years, 3 months ago
Patchew URL: https://patchew.org/QEMU/20191128141518.628245-1-marcandre.lureau@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH 0/2] RFC: add -mem-shared option
Type: series
Message-id: 20191128141518.628245-1-marcandre.lureau@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
e2537da Add -mem-shared option
623d044 memfd: add qemu_memfd_open()

=== OUTPUT BEGIN ===
1/2 Checking commit 623d044023d1 (memfd: add qemu_memfd_open())
2/2 Checking commit e2537da34663 (Add -mem-shared option)
ERROR: do not initialise globals to 0 or NULL
#123: FILE: vl.c:146:
+int mem_shared = 0;

total: 1 errors, 0 warnings, 90 lines checked

Patch 2/2 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20191128141518.628245-1-marcandre.lureau@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Gerd Hoffmann 4 years, 3 months ago
On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> Hi,
> 
> Setting up shared memory for vhost-user is a bit complicated from
> command line, as it requires NUMA setup such as: m 4G -object
> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> node,memdev=mem.
> 
> Instead, I suggest to add a -mem-shared option for non-numa setups,
> that will make the -mem-path or anonymouse memory shareable.

Is it an option to just use memfd by default (if available)?

cheers,
  Gerd


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Marc-André Lureau 4 years, 3 months ago
Hi

On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
>
> On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > Hi,
> >
> > Setting up shared memory for vhost-user is a bit complicated from
> > command line, as it requires NUMA setup such as: m 4G -object
> > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > node,memdev=mem.
> >
> > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > that will make the -mem-path or anonymouse memory shareable.
>
> Is it an option to just use memfd by default (if available)?

Yes, with a fallback path currently using a temporary file under /tmp
(we may want to use shm_open() instead, or a different location such
as XDG_RUNTIME_DIR? - and use O_TMPFILE)


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Daniel P. Berrangé 4 years, 3 months ago
On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> >
> > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > Hi,
> > >
> > > Setting up shared memory for vhost-user is a bit complicated from
> > > command line, as it requires NUMA setup such as: m 4G -object
> > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > node,memdev=mem.
> > >
> > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > that will make the -mem-path or anonymouse memory shareable.
> >
> > Is it an option to just use memfd by default (if available)?
> 
> Yes, with a fallback path currently using a temporary file under /tmp
> (we may want to use shm_open() instead, or a different location such
> as XDG_RUNTIME_DIR? - and use O_TMPFILE)

We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
where near standard across all OS distros, and even if on tmpfs these
dirs can be size limited to a significant subset of available RAM. IOW
we can't safely use them unless explicitly told to.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Marc-André Lureau 4 years, 3 months ago
Hi

On Fri, Nov 29, 2019 at 1:27 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> > Hi
> >
> > On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > >
> > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > > Hi,
> > > >
> > > > Setting up shared memory for vhost-user is a bit complicated from
> > > > command line, as it requires NUMA setup such as: m 4G -object
> > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > node,memdev=mem.
> > > >
> > > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > > that will make the -mem-path or anonymouse memory shareable.
> > >
> > > Is it an option to just use memfd by default (if available)?
> >
> > Yes, with a fallback path currently using a temporary file under /tmp
> > (we may want to use shm_open() instead, or a different location such
> > as XDG_RUNTIME_DIR? - and use O_TMPFILE)
>
> We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> where near standard across all OS distros, and even if on tmpfs these
> dirs can be size limited to a significant subset of available RAM. IOW
> we can't safely use them unless explicitly told to.

Is shm_open() acceptable instead?

Imho, -mem-shared should do a best effort by default, but can fail.


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Daniel P. Berrangé 4 years, 3 months ago
On Fri, Nov 29, 2019 at 01:31:11PM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Nov 29, 2019 at 1:27 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > > >
> > > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > > > Hi,
> > > > >
> > > > > Setting up shared memory for vhost-user is a bit complicated from
> > > > > command line, as it requires NUMA setup such as: m 4G -object
> > > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > > node,memdev=mem.
> > > > >
> > > > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > > > that will make the -mem-path or anonymouse memory shareable.
> > > >
> > > > Is it an option to just use memfd by default (if available)?
> > >
> > > Yes, with a fallback path currently using a temporary file under /tmp
> > > (we may want to use shm_open() instead, or a different location such
> > > as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> >
> > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > where near standard across all OS distros, and even if on tmpfs these
> > dirs can be size limited to a significant subset of available RAM. IOW
> > we can't safely use them unless explicitly told to.
> 
> Is shm_open() acceptable instead?
> 
> Imho, -mem-shared should do a best effort by default, but can fail.

Possibly, but if we're relying on shm_open choosing the path, then
its harder for users to know what files to clean up when QEMU crashes
or otherwise exits wthout a shm_unlink

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Marc-André Lureau 4 years, 3 months ago
On Fri, Nov 29, 2019 at 1:42 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, Nov 29, 2019 at 01:31:11PM +0400, Marc-André Lureau wrote:
> > Hi
> >
> > On Fri, Nov 29, 2019 at 1:27 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> > > > Hi
> > > >
> > > > On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > > > >
> > > > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Setting up shared memory for vhost-user is a bit complicated from
> > > > > > command line, as it requires NUMA setup such as: m 4G -object
> > > > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > > > node,memdev=mem.
> > > > > >
> > > > > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > > > > that will make the -mem-path or anonymouse memory shareable.
> > > > >
> > > > > Is it an option to just use memfd by default (if available)?
> > > >
> > > > Yes, with a fallback path currently using a temporary file under /tmp
> > > > (we may want to use shm_open() instead, or a different location such
> > > > as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> > >
> > > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > > where near standard across all OS distros, and even if on tmpfs these
> > > dirs can be size limited to a significant subset of available RAM. IOW
> > > we can't safely use them unless explicitly told to.
> >
> > Is shm_open() acceptable instead?
> >
> > Imho, -mem-shared should do a best effort by default, but can fail.
>
> Possibly, but if we're relying on shm_open choosing the path, then
> its harder for users to know what files to clean up when QEMU crashes
> or otherwise exits wthout a shm_unlink

True, although I think you can call shm_unlink() right after
shm_open() that should limit the risk of those forgotten files.


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Gerd Hoffmann 4 years, 3 months ago
  Hi,

> > Possibly, but if we're relying on shm_open choosing the path, then
> > its harder for users to know what files to clean up when QEMU crashes
> > or otherwise exits wthout a shm_unlink
> 
> True, although I think you can call shm_unlink() right after
> shm_open() that should limit the risk of those forgotten files.

Yes, this is possible and pretty standard ...

cheers,
  Gerd


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Paolo Bonzini 4 years, 3 months ago
On 29/11/19 10:27, Daniel P. Berrangé wrote:
>> Yes, with a fallback path currently using a temporary file under /tmp
>> (we may want to use shm_open() instead, or a different location such
>> as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> where near standard across all OS distros, and even if on tmpfs these
> dirs can be size limited to a significant subset of available RAM. IOW
> we can't safely use them unless explicitly told to.

Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
hostmem-memfd.c though, basically as a fallback option?  In principle
one could even use getmntent to search for a hugetlbfs mount.

Paolo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Daniel P. Berrangé 4 years, 3 months ago
On Fri, Nov 29, 2019 at 10:33:39AM +0100, Paolo Bonzini wrote:
> On 29/11/19 10:27, Daniel P. Berrangé wrote:
> >> Yes, with a fallback path currently using a temporary file under /tmp
> >> (we may want to use shm_open() instead, or a different location such
> >> as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > where near standard across all OS distros, and even if on tmpfs these
> > dirs can be size limited to a significant subset of available RAM. IOW
> > we can't safely use them unless explicitly told to.
> 
> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
> hostmem-memfd.c though, basically as a fallback option?  In principle
> one could even use getmntent to search for a hugetlbfs mount.

With mkstemp you still need to pick a location, and I don't think it
is clear there is a reliable choice that will always work.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Paolo Bonzini 4 years, 3 months ago
On 29/11/19 10:39, Daniel P. Berrangé wrote:
> On Fri, Nov 29, 2019 at 10:33:39AM +0100, Paolo Bonzini wrote:
>> On 29/11/19 10:27, Daniel P. Berrangé wrote:
>>>> Yes, with a fallback path currently using a temporary file under /tmp
>>>> (we may want to use shm_open() instead, or a different location such
>>>> as XDG_RUNTIME_DIR? - and use O_TMPFILE)
>>> We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
>>> where near standard across all OS distros, and even if on tmpfs these
>>> dirs can be size limited to a significant subset of available RAM. IOW
>>> we can't safely use them unless explicitly told to.
>>
>> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
>> hostmem-memfd.c though, basically as a fallback option?  In principle
>> one could even use getmntent to search for a hugetlbfs mount.
> 
> With mkstemp you still need to pick a location, and I don't think it
> is clear there is a reliable choice that will always work.

Sorry, I meant mktemp (which is almost never the right choice so my
brain fat-fingered it...) + shm_open(O_CREAT|O_EXCL).

Paolo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Igor Mammedov 4 years, 3 months ago
On Fri, 29 Nov 2019 10:33:39 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 29/11/19 10:27, Daniel P. Berrangé wrote:
> >> Yes, with a fallback path currently using a temporary file under /tmp
> >> (we may want to use shm_open() instead, or a different location such
> >> as XDG_RUNTIME_DIR? - and use O_TMPFILE)  
> > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > where near standard across all OS distros, and even if on tmpfs these
> > dirs can be size limited to a significant subset of available RAM. IOW
> > we can't safely use them unless explicitly told to.  
> 
> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
> hostmem-memfd.c though, basically as a fallback option?  In principle
> one could even use getmntent to search for a hugetlbfs mount.

So far fall backs proved to be a pain to deal with, as end users can't
be sure what machine they are getting eventually.
I'd prefer if we fail cleanly if asked config isn't possible and
let user fix vm configuration instead.

> Paolo
> 
> 


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Paolo Bonzini 4 years, 3 months ago
On 29/11/19 11:13, Igor Mammedov wrote:
>> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
>> hostmem-memfd.c though, basically as a fallback option?  In principle
>> one could even use getmntent to search for a hugetlbfs mount.
> So far fall backs proved to be a pain to deal with, as end users can't
> be sure what machine they are getting eventually.
> I'd prefer if we fail cleanly if asked config isn't possible and
> let user fix vm configuration instead.
> 

As far as I know memfd vs. mktemp+shm_open+shm_unlink is pretty much the
same thing.  memfd provide additional features such as sealing, but
unless someone explicitly checks for memfd features, the two should look
the same.

Paolo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Eduardo Habkost 4 years, 3 months ago
On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> Hi,
> 
> Setting up shared memory for vhost-user is a bit complicated from
> command line, as it requires NUMA setup such as: m 4G -object
> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> node,memdev=mem.
> 
> Instead, I suggest to add a -mem-shared option for non-numa setups,
> that will make the -mem-path or anonymouse memory shareable.

Can we make this be a "-m" option?

Or, even better: can we make "-m" options be automatically
translated to memory-backend-* options somehow?

-- 
Eduardo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Igor Mammedov 4 years, 3 months ago
On Thu, 28 Nov 2019 13:10:21 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > Hi,
> > 
> > Setting up shared memory for vhost-user is a bit complicated from
> > command line, as it requires NUMA setup such as: m 4G -object
> > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > node,memdev=mem.
> > 
> > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > that will make the -mem-path or anonymouse memory shareable.  
> 
> Can we make this be a "-m" option?
> 
> Or, even better: can we make "-m" options be automatically
> translated to memory-backend-* options somehow?
-m SIZE will be translated to a one of memdev properties,
so new suboption potentially could be an aliased to
another memdev property.


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Paolo Bonzini 4 years, 3 months ago
On 28/11/19 17:10, Eduardo Habkost wrote:
> On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
>> Hi,
>>
>> Setting up shared memory for vhost-user is a bit complicated from
>> command line, as it requires NUMA setup such as: m 4G -object
>> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
>> node,memdev=mem.
>>
>> Instead, I suggest to add a -mem-shared option for non-numa setups,
>> that will make the -mem-path or anonymouse memory shareable.
> 
> Can we make this be a "-m" option?
> 
> Or, even better: can we make "-m" options be automatically
> translated to memory-backend-* options somehow?
> 

The original idea was to always support one NUMA node, so that you could
do "-numa node,memdev=..." to specify a memory backend with -object.
However, this is not possible anymore since

    if (!mc->cpu_index_to_instance_props ||
        !mc->get_default_cpu_node_id) {
        error_setg(errp, "NUMA is not supported by this machine-type");
        return;
    }

has been added to hw/core/numa.c.

Therefore, I think instead of -mem-shared we should add a "-m
memdev=..." option.  This option:

* would be mutually exclusive with both -mem-path

* would be handled from allocate_system_memory_nonnuma.

* could be mutually exclusive "-numa node", or could just be mutually
exclusive with "-numa node,memdev=..." (the logical conclusion of that
however would be an undeprecation of "-numa node,mem=...", so that has
to be taken into account as well).

Paolo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Igor Mammedov 4 years, 3 months ago
On Fri, 29 Nov 2019 10:31:36 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 28/11/19 17:10, Eduardo Habkost wrote:
> > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:  
> >> Hi,
> >>
> >> Setting up shared memory for vhost-user is a bit complicated from
> >> command line, as it requires NUMA setup such as: m 4G -object
> >> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> >> node,memdev=mem.
> >>
> >> Instead, I suggest to add a -mem-shared option for non-numa setups,
> >> that will make the -mem-path or anonymouse memory shareable.  
> > 
> > Can we make this be a "-m" option?
> > 
> > Or, even better: can we make "-m" options be automatically
> > translated to memory-backend-* options somehow?
> >   
[...]
> Therefore, I think instead of -mem-shared we should add a "-m
> memdev=..." option.  This option:
> 
> * would be mutually exclusive with both -mem-path
> 
> * would be handled from allocate_system_memory_nonnuma.
> 
> * could be mutually exclusive "-numa node", or could just be mutually
> exclusive with "-numa node,memdev=..." (the logical conclusion of that
> however would be an undeprecation of "-numa node,mem=...", so that has
> to be taken into account as well).

the plan was/still is to have memdev for main ram and deprecate
"-numa node,mem=..." for new machine types (can't be done for
old ones due to migration breakages and old libvirt which uses
it). So new machines will be forced to use "-numa memdev"
For old machines '-m memdev' provided memory region will used to
keep "-numa node,mem=..." working as it's now (however broken it is).


> 
> Paolo
> 
> 


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Paolo Bonzini 4 years, 3 months ago
On 29/11/19 11:23, Igor Mammedov wrote:
> [...]
>> Therefore, I think instead of -mem-shared we should add a "-m
>> memdev=..." option.  This option:
>>
>> * would be mutually exclusive with both -mem-path
>>
>> * would be handled from allocate_system_memory_nonnuma.
>>
>> * could be mutually exclusive "-numa node", or could just be mutually
>> exclusive with "-numa node,memdev=..." (the logical conclusion of that
>> however would be an undeprecation of "-numa node,mem=...", so that has
>> to be taken into account as well).
> the plan was/still is to have memdev for main ram and deprecate
> "-numa node,mem=..." for new machine types (can't be done for
> old ones due to migration breakages and old libvirt which uses
> it). So new machines will be forced to use "-numa memdev"
> For old machines '-m memdev' provided memory region will used to
> keep "-numa node,mem=..." working as it's now (however broken it is).

Ok, then "-m memdev=..." would only be mutually exclusive with
"-mem-path" and "-numa node,memdev=...".

Thanks!

Paolo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Eduardo Habkost 4 years, 3 months ago
On Fri, Nov 29, 2019 at 10:31:36AM +0100, Paolo Bonzini wrote:
> On 28/11/19 17:10, Eduardo Habkost wrote:
> > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> >> Hi,
> >>
> >> Setting up shared memory for vhost-user is a bit complicated from
> >> command line, as it requires NUMA setup such as: m 4G -object
> >> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> >> node,memdev=mem.
> >>
> >> Instead, I suggest to add a -mem-shared option for non-numa setups,
> >> that will make the -mem-path or anonymouse memory shareable.
> > 
> > Can we make this be a "-m" option?
> > 
> > Or, even better: can we make "-m" options be automatically
> > translated to memory-backend-* options somehow?
> > 
> 
> The original idea was to always support one NUMA node, so that you could
> do "-numa node,memdev=..." to specify a memory backend with -object.
> However, this is not possible anymore since
> 
>     if (!mc->cpu_index_to_instance_props ||
>         !mc->get_default_cpu_node_id) {
>         error_setg(errp, "NUMA is not supported by this machine-type");
>         return;
>     }
> 
> has been added to hw/core/numa.c.
> 
> Therefore, I think instead of -mem-shared we should add a "-m
> memdev=..." option.  This option:
> 
> * would be mutually exclusive with both -mem-path
> 
> * would be handled from allocate_system_memory_nonnuma.
> 
> * could be mutually exclusive "-numa node", or could just be mutually
> exclusive with "-numa node,memdev=..." (the logical conclusion of that
> however would be an undeprecation of "-numa node,mem=...", so that has
> to be taken into account as well).

I completely agree we could do this.  I just think this misses
completely the point of this series, because usability of:

  -object memory-backend-file,...,share=on,id=mem -m ...,memdev=mem

is not much better than the usability of:

  -object memory-backend-file,...,share=on,id=mem -numa node,memdev=mem

-- 
Eduardo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Marc-André Lureau 4 years, 3 months ago
Hi

On Sat, Nov 30, 2019 at 12:23 AM Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> On Fri, Nov 29, 2019 at 10:31:36AM +0100, Paolo Bonzini wrote:
> > On 28/11/19 17:10, Eduardo Habkost wrote:
> > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > >> Hi,
> > >>
> > >> Setting up shared memory for vhost-user is a bit complicated from
> > >> command line, as it requires NUMA setup such as: m 4G -object
> > >> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > >> node,memdev=mem.
> > >>
> > >> Instead, I suggest to add a -mem-shared option for non-numa setups,
> > >> that will make the -mem-path or anonymouse memory shareable.
> > >
> > > Can we make this be a "-m" option?
> > >
> > > Or, even better: can we make "-m" options be automatically
> > > translated to memory-backend-* options somehow?
> > >
> >
> > The original idea was to always support one NUMA node, so that you could
> > do "-numa node,memdev=..." to specify a memory backend with -object.
> > However, this is not possible anymore since
> >
> >     if (!mc->cpu_index_to_instance_props ||
> >         !mc->get_default_cpu_node_id) {
> >         error_setg(errp, "NUMA is not supported by this machine-type");
> >         return;
> >     }
> >
> > has been added to hw/core/numa.c.
> >
> > Therefore, I think instead of -mem-shared we should add a "-m
> > memdev=..." option.  This option:
> >
> > * would be mutually exclusive with both -mem-path
> >
> > * would be handled from allocate_system_memory_nonnuma.
> >
> > * could be mutually exclusive "-numa node", or could just be mutually
> > exclusive with "-numa node,memdev=..." (the logical conclusion of that
> > however would be an undeprecation of "-numa node,mem=...", so that has
> > to be taken into account as well).
>
> I completely agree we could do this.  I just think this misses
> completely the point of this series, because usability of:
>
>   -object memory-backend-file,...,share=on,id=mem -m ...,memdev=mem
>
> is not much better than the usability of:
>
>   -object memory-backend-file,...,share=on,id=mem -numa node,memdev=mem
>

+1
Perhaps when all RAM allocation will occur through memory-backend,
"-mem-shared" could be simply an alias to "-global
memory-backend.shared=on"


-- 
Marc-André Lureau

Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by Paolo Bonzini 4 years, 3 months ago
On 01/12/19 16:40, Marc-André Lureau wrote:
>>> The original idea was to always support one NUMA node, so that you could
>>> do "-numa node,memdev=..." to specify a memory backend with -object.
>>> However, this is not possible anymore since
>>>
>>>     if (!mc->cpu_index_to_instance_props ||
>>>         !mc->get_default_cpu_node_id) {
>>>         error_setg(errp, "NUMA is not supported by this machine-type");
>>>         return;
>>>     }
>>>
>>> has been added to hw/core/numa.c.
>>>
>>> Therefore, I think instead of -mem-shared we should add a "-m
>>> memdev=..." option.  This option:
>>>
>>> * would be mutually exclusive with both -mem-path
>>>
>>> * would be handled from allocate_system_memory_nonnuma.
>>>
>>> * could be mutually exclusive "-numa node", or could just be mutually
>>> exclusive with "-numa node,memdev=..." (the logical conclusion of that
>>> however would be an undeprecation of "-numa node,mem=...", so that has
>>> to be taken into account as well).
>> I completely agree we could do this.  I just think this misses
>> completely the point of this series, because usability of:
>>
>>   -object memory-backend-file,...,share=on,id=mem -m ...,memdev=mem
>>
>> is not much better than the usability of:
>>
>>   -object memory-backend-file,...,share=on,id=mem -numa node,memdev=mem
>>
> +1
> Perhaps when all RAM allocation will occur through memory-backend,
> "-mem-shared" could be simply an alias to "-global
> memory-backend.shared=on"

Yes, this is the point.  There are two parts in this series:

(1) allowing use of vhost-user on non-NUMA machines

(2) providing syntactic sugar for it

I have no problem with -mem-shared for (2), but it should just be
syntactic sugar for (1).  It's okay if -mem-shared is a global variable
rather than an alias for -global; the important part is not to add any
feature that is not available from the QOM-style command line options.

Paolo


Re: [PATCH 0/2] RFC: add -mem-shared option
Posted by no-reply@patchew.org 4 years, 3 months ago
Patchew URL: https://patchew.org/QEMU/20191128141518.628245-1-marcandre.lureau@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      x86_64-softmmu/accel/tcg/tcg-runtime-gvec.o
  CC      x86_64-softmmu/accel/tcg/cpu-exec.o
/tmp/qemu-test/src/exec.c: In function 'qemu_ram_alloc_from_file':
/tmp/qemu-test/src/exec.c:2366:12: error: 'created' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         if (created) {
            ^
cc1: all warnings being treated as errors
make[1]: *** [exec.o] Error 1
make[1]: *** Waiting for unfinished jobs....
  CC      aarch64-softmmu/disas.o
  GEN     aarch64-softmmu/gdbstub-xml.c
/tmp/qemu-test/src/exec.c: In function 'qemu_ram_alloc_from_file':
/tmp/qemu-test/src/exec.c:2366:12: error: 'created' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         if (created) {
            ^
cc1: all warnings being treated as errors
make[1]: *** [exec.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs....
make: *** [aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=ce348942cc144218ab665262b5e2bf34', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-fts2mukp/src/docker-src.2019-11-28-23.35.08.15514:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=ce348942cc144218ab665262b5e2bf34
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-fts2mukp/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    2m27.615s
user    0m8.146s


The full log is available at
http://patchew.org/logs/20191128141518.628245-1-marcandre.lureau@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com