[libvirt PATCH v2 00/16] Use nbdkit for http/ftp/ssh network drives in libvirt

Jonathon Jongsma posted 16 patches 1 year, 8 months ago
git fetch https://github.com/patchew-project/libvirt tags/patchew/20220831184101.716362-1-jjongsma@redhat.com
[libvirt PATCH v2 00/16] Use nbdkit for http/ftp/ssh network drives in libvirt
Posted by Jonathon Jongsma 1 year, 8 months ago
After a bit of a lengthy delay, this is the second version of this patch
series. See https://bugzilla.redhat.com/show_bug.cgi?id=2016527 for more
information about the goal, but the summary is that RHEL does not want to ship
the qemu storage plugins for curl and ssh.  Handling them outside of the qemu
process provides several advantages such as reduced attack surface and
stability.

A quick summary of the code:
 - at startup I query to see whether nbdkit exists on the host and if
   so, I query which plugins/filters are installed. These capabilities
   are cached and stored in the qemu driver
 - When the driver prepares the domain, we go through each disk source
   and determine whether the nbdkit capabilities allow us to support
   this disk via nbdkit, and if so, we allocate a qemuNbdkitProcess
   object and stash it in the private data of the virStorageSource.
 - The presence or absence of this qemuNbdkitProcess data then indicates
   whether this disk will be served to qemu indirectly via nbdkit or
   directly
 - When we launch the qemuProcess, as part of the "external device
   start" step, I launch an nbdkit process for each disk that is supported
   by nbdkit.
 - for devices which are served by an intermediate nbdkit process, I
   change the qemu commandline in the following ways:
   - I no longer pass auth/cookie secrets to qemu (those are handled by
     nbdkit)
   - I replace the actual network URL of the remote disk source with the
     path to the nbdkit unix socket (see the sketch below)
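
To make that concrete, here is a rough sketch of what this looks like at
the process level. All paths and node names below are hypothetical, and
the exact arguments differ from what the patches actually generate:

    # probe nbdkit and its plugins (the results are cached)
    $ nbdkit --version
    $ nbdkit curl --dump-plugin

    # per-disk helper process serving an https source over a unix socket
    $ nbdkit --unix /var/lib/libvirt/qemu/domain-1-vm/nbdkit-disk0.sock \
          --exit-with-parent \
          curl url=https://example.org/image.qcow2

qemu then never sees the curl protocol at all: instead of a blockdev
definition like

    {"driver": "https", "url": "https://example.org/image.qcow2", ...}

it gets an NBD client pointed at the local socket:

    {"driver": "nbd",
     "server": {"type": "unix",
                "path": "/var/lib/libvirt/qemu/domain-1-vm/nbdkit-disk0.sock"},
     "node-name": "libvirt-1-storage", ...}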

Open questions
 - selinux: I need some help from people more familiar with selinux to figure
   out what is needed here. When selinux is enforcing, I get a failure to
   launch nbdkit to serve the disks. I suspect we need a new context and policy
   for /usr/sbin/nbdkit that allows it to transition to the appropriate selinux
   context. The current context (on fedora) is "system_u:object_r:bin_t:s0".
   When I (temporarily) change the context to something like qemu_exec_t,
   I am able to start nbdkit and the domain launches.

Known shortcomings
 - creating disks (in ssh) still isn't supported. I wanted to send out the
   patch series anyway since it's been delayed too long already.

Changes since v1:
 - split into multiple patches
 - added a build option for nbdkit_moddir
 - don't instantiate any secret / cookie props for disks that are being served
   by nbdkit since we don't send secrets to qemu anymore
 - ensure that nbdkit processes are started/stopped for the entire backing
   chain
 - switch to virFileCache-based capabilities for nbdkit so that we don't need
   to requery every time
 - switch to using pipes for communicating sensitive data to nbdkit (see
   the sketch after this list)
 - use pidfile support built into virCommand rather than nbdkit's --pidfile
   argument
 - added significantly more tests
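
A hypothetical resulting nbdkit invocation for the pipe-based secret
passing (the socket path, URL and fd number here are all made up) would
look something like:

    $ nbdkit --unix /path/to/nbdkit-disk0.sock --exit-with-parent \
          curl url=https://example.org/image.qcow2 password=-45

where nbdkit's password=-FD syntax makes the curl plugin read the
password from file descriptor 45, a pipe inherited from libvirtd, so the
secret never appears on the command line or in the environment.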

Jonathon Jongsma (16):
  schema: allow 'ssh' as a protocol for network disks
  qemu: Add qemuNbdkitCaps to qemu driver
  qemu: expand nbdkit capabilities
  util: Allow virFileCache data to be any GObject
  qemu: implement basic virFileCache for nbdkit caps
  qemu: implement persistent file cache for nbdkit caps
  qemu: use file cache for nbdkit caps
  qemu: Add qemuNbdkitProcess
  qemu: add functions to start and stop nbdkit
  tests: add ability to test various nbdkit capabilities
  qemu: split qemuDomainSecretStorageSourcePrepare
  qemu: use nbdkit to serve network disks if available
  qemu: include nbdkit state in private xml
  tests: add tests for nbdkit invocation
  qemu: pass sensitive data to nbdkit via pipe
  qemu: add test for authenticating a https network disk

 build-aux/syntax-check.mk                     |    4 +-
 docs/formatdomain.rst                         |    2 +-
 meson.build                                   |    6 +
 meson_options.txt                             |    1 +
 po/POTFILES                                   |    1 +
 src/conf/schemas/domaincommon.rng             |    1 +
 src/qemu/meson.build                          |    1 +
 src/qemu/qemu_block.c                         |  168 ++-
 src/qemu/qemu_command.c                       |    4 +-
 src/qemu/qemu_conf.c                          |   22 +
 src/qemu/qemu_conf.h                          |    6 +
 src/qemu/qemu_domain.c                        |  176 ++-
 src/qemu/qemu_domain.h                        |    4 +
 src/qemu/qemu_driver.c                        |    3 +
 src/qemu/qemu_extdevice.c                     |   84 ++
 src/qemu/qemu_nbdkit.c                        | 1051 +++++++++++++++++
 src/qemu/qemu_nbdkit.h                        |   90 ++
 src/qemu/qemu_nbdkitpriv.h                    |   46 +
 src/util/virfilecache.c                       |   15 +-
 src/util/virfilecache.h                       |    2 +-
 src/util/virutil.h                            |    2 +-
 tests/meson.build                             |    1 +
 .../disk-cdrom-network.args.disk0             |    7 +
 .../disk-cdrom-network.args.disk1             |    9 +
 .../disk-cdrom-network.args.disk1.pipe.45     |    1 +
 .../disk-cdrom-network.args.disk2             |    9 +
 .../disk-cdrom-network.args.disk2.pipe.47     |    1 +
 .../disk-network-http.args.disk0              |    7 +
 .../disk-network-http.args.disk1              |    6 +
 .../disk-network-http.args.disk2              |    7 +
 .../disk-network-http.args.disk2.pipe.45      |    1 +
 .../disk-network-http.args.disk3              |    8 +
 .../disk-network-http.args.disk3.pipe.47      |    1 +
 ...work-source-curl-nbdkit-backing.args.disk0 |    8 +
 ...rce-curl-nbdkit-backing.args.disk0.pipe.45 |    1 +
 .../disk-network-source-curl.args.1.pipe.1    |    1 +
 .../disk-network-source-curl.args.disk0       |    8 +
 ...isk-network-source-curl.args.disk0.pipe.45 |    1 +
 .../disk-network-source-curl.args.disk1       |   10 +
 ...isk-network-source-curl.args.disk1.pipe.47 |    1 +
 ...isk-network-source-curl.args.disk1.pipe.49 |    1 +
 .../disk-network-source-curl.args.disk2       |    8 +
 ...isk-network-source-curl.args.disk2.pipe.49 |    1 +
 ...isk-network-source-curl.args.disk2.pipe.51 |    1 +
 .../disk-network-source-curl.args.disk3       |    7 +
 .../disk-network-source-curl.args.disk4       |    7 +
 .../disk-network-ssh.args.disk0               |    7 +
 tests/qemunbdkittest.c                        |  271 +++++
 ...sk-cdrom-network-nbdkit.x86_64-latest.args |   42 +
 .../disk-cdrom-network-nbdkit.xml             |    1 +
 ...isk-network-http-nbdkit.x86_64-latest.args |   45 +
 .../disk-network-http-nbdkit.xml              |    1 +
 ...rce-curl-nbdkit-backing.x86_64-latest.args |   38 +
 ...isk-network-source-curl-nbdkit-backing.xml |   45 +
 ...work-source-curl-nbdkit.x86_64-latest.args |   50 +
 .../disk-network-source-curl-nbdkit.xml       |    1 +
 ...isk-network-source-curl.x86_64-latest.args |   54 +
 .../disk-network-source-curl.xml              |   74 ++
 ...disk-network-ssh-nbdkit.x86_64-latest.args |   36 +
 .../disk-network-ssh-nbdkit.xml               |    1 +
 .../disk-network-ssh.x86_64-latest.args       |   36 +
 tests/qemuxml2argvdata/disk-network-ssh.xml   |   31 +
 tests/qemuxml2argvtest.c                      |   18 +
 tests/testutilsqemu.c                         |   27 +
 tests/testutilsqemu.h                         |    5 +
 65 files changed, 2474 insertions(+), 111 deletions(-)
 create mode 100644 src/qemu/qemu_nbdkit.c
 create mode 100644 src/qemu/qemu_nbdkit.h
 create mode 100644 src/qemu/qemu_nbdkitpriv.h
 create mode 100644 tests/qemunbdkitdata/disk-cdrom-network.args.disk0
 create mode 100644 tests/qemunbdkitdata/disk-cdrom-network.args.disk1
 create mode 100644 tests/qemunbdkitdata/disk-cdrom-network.args.disk1.pipe.45
 create mode 100644 tests/qemunbdkitdata/disk-cdrom-network.args.disk2
 create mode 100644 tests/qemunbdkitdata/disk-cdrom-network.args.disk2.pipe.47
 create mode 100644 tests/qemunbdkitdata/disk-network-http.args.disk0
 create mode 100644 tests/qemunbdkitdata/disk-network-http.args.disk1
 create mode 100644 tests/qemunbdkitdata/disk-network-http.args.disk2
 create mode 100644 tests/qemunbdkitdata/disk-network-http.args.disk2.pipe.45
 create mode 100644 tests/qemunbdkitdata/disk-network-http.args.disk3
 create mode 100644 tests/qemunbdkitdata/disk-network-http.args.disk3.pipe.47
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl-nbdkit-backing.args.disk0
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl-nbdkit-backing.args.disk0.pipe.45
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.1.pipe.1
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk0
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk0.pipe.45
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk1
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk1.pipe.47
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk1.pipe.49
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk2
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk2.pipe.49
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk2.pipe.51
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk3
 create mode 100644 tests/qemunbdkitdata/disk-network-source-curl.args.disk4
 create mode 100644 tests/qemunbdkitdata/disk-network-ssh.args.disk0
 create mode 100644 tests/qemunbdkittest.c
 create mode 100644 tests/qemuxml2argvdata/disk-cdrom-network-nbdkit.x86_64-latest.args
 create mode 120000 tests/qemuxml2argvdata/disk-cdrom-network-nbdkit.xml
 create mode 100644 tests/qemuxml2argvdata/disk-network-http-nbdkit.x86_64-latest.args
 create mode 120000 tests/qemuxml2argvdata/disk-network-http-nbdkit.xml
 create mode 100644 tests/qemuxml2argvdata/disk-network-source-curl-nbdkit-backing.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/disk-network-source-curl-nbdkit-backing.xml
 create mode 100644 tests/qemuxml2argvdata/disk-network-source-curl-nbdkit.x86_64-latest.args
 create mode 120000 tests/qemuxml2argvdata/disk-network-source-curl-nbdkit.xml
 create mode 100644 tests/qemuxml2argvdata/disk-network-source-curl.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/disk-network-source-curl.xml
 create mode 100644 tests/qemuxml2argvdata/disk-network-ssh-nbdkit.x86_64-latest.args
 create mode 120000 tests/qemuxml2argvdata/disk-network-ssh-nbdkit.xml
 create mode 100644 tests/qemuxml2argvdata/disk-network-ssh.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/disk-network-ssh.xml

-- 
2.37.1
Re: [libvirt PATCH v2 00/16] Use nbdkit for http/ftp/ssh network drives in libvirt
Posted by Peter Krempa 1 year, 7 months ago
On Wed, Aug 31, 2022 at 13:40:45 -0500, Jonathon Jongsma wrote:
> After a bit of a lengthy delay, this is the second version of this patch
> series. See https://bugzilla.redhat.com/show_bug.cgi?id=2016527 for more
> information about the goal, but the summary is that RHEL does not want to ship
> the qemu storage plugins for curl and ssh.  Handling them outside of the qemu
> process provides several advantages such as reduced attack surface and
> stability.

IMO it's also worth noting that it increases the complexity of the setup
and potentially also resource usage.

> A quick summary of the code:

[...]

> Open questions
>  - selinux: I need some help from people more familiar with selinux to figure
>    out what is needed here. When selinux is enforcing, I get a failure to
>    launch nbdkit to serve the disks. I suspect we need a new context and policy
>    for /usr/sbin/nbdkit that allows it to transition to the appropriate selinux
>    context. The current context (on fedora) is "system_u:object_r:bin_t:s0".
>    When I (temporarily) change the context to something like qemu_exec_t,
>    I am able to start nbdkit and the domain launches.

Is the problem with starting the 'nbdkit' process itself or with the
socket?

At least in the case of the socket we must make sure that no other process
can access it, especially once you pass authentication to nbdkit, to avoid
any kind of backdoor to authenticated storage.

A few more open questions:
 - What if 'nbdkit' crashes?
    With an integrated block layer, the whole VM crashes. Now that we
    have separated access to disks (this is also an issue for use of
    the qemu-storage-daemon), if any of the helper processes crash we get
    into a new situation.

    I think we'll need to think about:
        - adding an event for any of the helper processes failing
        - adding a lifecycle action for it (e.g. pause qemu if nbdkit
          dies)
        - think about possible recovery of the situation

 - resource pinning
   For now all the resources are integral to the qemu process so
   emulator and iothread pinning can be used to steer which cpus the
   disk should use. With us adding new possibly cpu intensive processes
   we'll probably need to consider how to handle them more generally and
   manage their resources.

 - Integration with qemu storage daemon used for a VM

   With the attempt to rewrite QSD in other languages it will possibly
   make sense to run it instead of the native qemu block layer (e.g.
   take advantage of memory safe languages). So we should also think
   about how these two will be able to coexist.

The last point is more of a future-work thing to consider, but the
first two points should be considered for the final release of this
feature. Specifically because in your current design it replaces the
in-qemu driver even in cases when it is compiled into qemu, thus also
for existing users. Alternatively it would have to be opt-in.

> Known shortcomings
>  - creating disks (in ssh) still isn't supported. I wanted to send out the
>    patch series anyway since it's been delayed too long already.

That shouldn't be a problem, there are plenty of protocols where we don't
support creating the storage. Creating storage is needed only for
snapshots, so we can simply refuse to do it in the first place.
Re: [libvirt PATCH v2 00/16] Use nbdkit for http/ftp/ssh network drives in libvirt
Posted by Jonathon Jongsma 1 year, 7 months ago
On 9/19/22 8:48 AM, Peter Krempa wrote:
> On Wed, Aug 31, 2022 at 13:40:45 -0500, Jonathon Jongsma wrote:
>> After a bit of a lengthy delay, this is the second version of this patch
>> series. See https://bugzilla.redhat.com/show_bug.cgi?id=2016527 for more
>> information about the goal, but the summary is that RHEL does not want to ship
>> the qemu storage plugins for curl and ssh.  Handling them outside of the qemu
>> process provides several advantages such as reduced attack surface and
>> stability.
> 
> IMO it's also worth noting that it increases the complexity of the setup
> and potentially also resource usage.
> 
>> A quick summary of the code:
> 
> [...]
> 
>> Open questions
>>   - selinux: I need some help from people more familiar with selinux to figure
>>     out what is needed here. When selinux is enforcing, I get a failure to
>>     launch nbdkit to serve the disks. I suspect we need a new context and policy
>>     for /usr/sbin/nbdkit that allows it to transition to the appropriate selinux
>>     context. The current context (on fedora) is "system_u:object_r:bin_t:s0".
>>     When I (temporarily) change the context to something like qemu_exec_t,
>>     I am able to start nbdkit and the domain launches.
> 
> Is the problem with starting the 'nbdkit' process itself or with the
> socket?

The problem seems to be with the nbdkit process itself. Because I use 
qemuSecurityCommandRun() to launch nbdkit, it sets the security label 
for the nbdkit process to be the security label for the domain and then 
attempts to launch it. Presumably there is no rule allowing the binaries 
with the bin_t context to transition to the label for the domain. But as 
I said, my selinux knowledge is quite shallow, so I may need some help 
figuring out the proper way to do this.
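
For reference, the bin_t context mentioned above comes from something
like the following (exact output varies by distro and policy version):

    $ ls -Z /usr/sbin/nbdkit
    system_u:object_r:bin_t:s0 /usr/sbin/nbdkit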


> At least in the case of the socket we must make sure that no other process
> can access it, especially once you pass authentication to nbdkit, to avoid
> any kind of backdoor to authenticated storage.

nbdkit does have a --selinux-label argument which will set the label of 
the socket.
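
For example (the label value here is only a guess at what a final policy
might actually use):

    $ nbdkit --selinux-label system_u:object_r:svirt_image_t:s0 \
          --unix /path/to/nbdkit-disk0.sock \
          curl url=https://example.org/image.qcow2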

> 
> A few more open questions:
>   - What if 'nbdkit' crashes?
>      With an integrated block layer, the whole VM crashes. Now that we
>      have separated access to disks (this is also an issue for use of
>      the qemu-storage-daemon), if any of the helper processes crash we get
>      into a new situation.
> 
>      I think we'll need to think about:
>          - adding an event for any of the helper processes failing
>          - adding a lifecycle action for it (e.g. pause qemu if nbdkit
>            dies)
>          - think about possible recovery of the situation

Indeed. A re-usable mechanism that could also cover qemu-storage-daemon 
seems like it would be useful. I'll look into it.

> 
>   - resource pinning
>     For now all the resources are integral to the qemu process so
>     emulator and iothread pinning can be used to steer which cpus the
>     disk should use. With us adding new possibly cpu intensive processes
>     we'll probably need to consider how to handle them more generally and
>     manage their resources.

Good point. I hadn't really thought about this.

> 
>   - Integration with qemu storage daemon used for a VM
> 
>     With the attempt to rewrite QSD in other languages it will possibly
>     make sense to run it instead of the native qemu block layer (e.g.
>     take advantage of memory safe languages). So we should also think
>     about how these two will be able to coexist.
> 
> The last point is more of a future-work thing to consider, but the
> first two points should be considered for the final release of this
> feature. Specifically because in your current design it replaces the
> in-qemu driver even in cases when it is compiled into qemu, thus also
> for existing users. Alternatively it would have to be opt-in.

As far as I can tell, there is currently no good way to determine (via 
capabilities or otherwise) whether e.g. the curl driver is compiled into 
a particular qemu binary. If I've missed something, please let me know. 
My recollection of previous discussions was that I should try to use 
nbdkit if available and then fall back to trying the built-in qemu 
driver approach. But we could change this strategy if there's a good way 
to do it. When you talk about opt-in, do you mean per-domain? e.g. in 
the xml? or some other mechanism?

> 
>> Known shortcomings
>>   - creating disks (in ssh) still isn't supported. I wanted to send out the
>>     patch series anyway since it's been delayed too long already.
> 
> That shouldn't be a problem, there are plenty of protocols where we don't
> support creating the storage. Creating storage is needed only for
> snapshots, so we can simply refuse to do it in the first place.
>
Re: [libvirt PATCH v2 00/16] Use nbdkit for http/ftp/ssh network drives in libvirt
Posted by Jonathon Jongsma 1 year, 7 months ago
On 9/21/22 1:44 PM, Jonathon Jongsma wrote:
> On 9/19/22 8:48 AM, Peter Krempa wrote:
>> On Wed, Aug 31, 2022 at 13:40:45 -0500, Jonathon Jongsma wrote:
>>> After a bit of a lengthy delay, this is the second version of this patch
>>> series. See https://bugzilla.redhat.com/show_bug.cgi?id=2016527 for more
>>> information about the goal, but the summary is that RHEL does not 
>>> want to ship
>>> the qemu storage plugins for curl and ssh.  Handling them outside of 
>>> the qemu
>>> process provides several advantages such as reduced attack surface and
>>> stability.
>>
>> IMO it's also worth noting that it increases the complexity of the setup
>> and potentially also resource usage.
>>
>>> A quick summary of the code:
>>
>> [...]
>>
>>> Open questions
>>>   - selinux: I need some help from people more familiar with selinux 
>>> to figure
>>>     out what is needed here. When selinux is enforcing, I get a 
>>> failure to
>>>     launch nbdkit to serve the disks. I suspect we need a new context 
>>> and policy
>>>     for /usr/sbin/nbdkit that allows it to transition to the 
>>> appropriate selinux
>>>     context. The current context (on fedora) is 
>>> "system_u:object_r:bin_t:s0".
>>>     When I (temporarily) change the context to something like 
>>> qemu_exec_t,
>>>     I am able to start nbdkit and the domain launches.
>>
>> Is the problem with starting the 'nbdkit' process itself or with the
>> socket?
> 
> The problem seems to be with the nbdkit process itself. Because I use 
> qemuSecurityCommandRun() to launch nbdkit, it sets the security label 
> for the nbdkit process to be the security label for the domain and then 
> attempts to launch it. Presumably there is no rule allowing the binaries 
> with the bin_t context to transition to the label for the domain. But as 
> I said, my selinux knowledge is quite shallow, so I may need some help 
> figuring out the proper way to do this.

Just for completeness, here's some additional information from 
SETroubleshoot when trying to start a domain using nbdkit with my 
patches. In this case I'm just executing the monolithic libvirtd daemon 
from my working directory for testing. The domain fails to start:

SELinux is preventing rpc-libvirtd from entrypoint access on the file 
/usr/sbin/nbdkit.

Additional Information:
Source Context                unconfined_u:unconfined_r:svirt_t:s0:c129,c164
Target Context                system_u:object_r:bin_t:s0
Target Objects                /usr/sbin/nbdkit [ file ]
Source                        rpc-libvirtd
Source Path                   rpc-libvirtd
Port                          <Unknown>

Raw Audit Messages
type=AVC msg=audit(1663876079.221:7519): avc:  denied  { entrypoint } 
for  pid=1750906 comm="rpc-libvirtd" path="/usr/sbin/nbdkit" dev="dm-1" 
ino=3160232 scontext=unconfined_u:unconfined_r:svirt_t:s0:c129,c164 
tcontext=system_u:object_r:bin_t:s0 tclass=file permissive=0


Then if I temporarily do something like:

# chcon -t qemu_exec_t /usr/sbin/nbdkit

nbdkit (and the domain) starts fine.



>> At least in the case of the socket we must make sure that no other process
>> can access it, especially once you pass authentication to nbdkit, to avoid
>> any kind of backdoor to authenticated storage.
> 
> nbdkit does have a --selinux-label argument which will set the label of 
> the socket.
> 
>>
>> A few more open questions:
>>   - What if 'nbdkit' crashes?
>>      With an integrated block layer, the whole VM crashes. Now that we
>>      have separated access to disks (this is also an issue for use of
>>      the qemu-storage-daemon), if any of the helper processes crash we get
>>      into a new situation.
>>
>>      I think we'll need to think about:
>>          - adding an event for any of the helper processes failing
>>          - adding a lifecycle action for it (e.g. pause qemu if nbdkit
>>            dies)
>>          - think about possible recovery of the situation
> 
> Indeed. A re-usable mechanism that could also cover qemu-storage-daemon 
> seems like it would be useful. I'll look into it.
> 
>>
>>   - resource pinning
>>     For now all the resources are integral to the qemu process so
>>     emulator and iothread pinning can be used to steer which cpus the
>>     disk should use. With us adding new possibly cpu intensive processes
>>     we'll probably need to consider how to handle them more generally and
>>     manage their resources.
> 
> Good point. I hadn't really thought about this.
> 
>>
>>   - Integration with qemu storage daemon used for a VM
>>
>>     With the attempt to rewrite QSD in other languages it will possibly
>>     make sense to run it instead of the native qemu block layer (e.g.
>>     take advantage of memory safe languages). So we should also think
>>     about how these two will be able to coexist.
>>
>> The last point is more of a future-work thing to consider, but the
>> first two points should be considered for the final release of this
>> feature. Specifically because in your current design it replaces the
>> in-qemu driver even in cases when it is compiled into qemu, thus also
>> for existing users. Alternatively it would have to be opt-in.
> 
> As far as I can tell, there is currently no good way to determine (via 
> capabilities or otherwise) whether e.g. the curl driver is compiled into 
> a particular qemu binary. If I've missed something, please let me know. 
> My recollection of previous discussions was that I should try to use 
> nbdkit if available and then fall back to trying the built-in qemu 
> driver approach. But we could change this strategy if there's a good way 
> to do it. When you talk about opt-in, do you mean per-domain? e.g. in 
> the xml? or some other mechanism?
> 
>>
>>> Known shortcomings
>>>   - creating disks (in ssh) still isn't supported. I wanted to send 
>>> out the
>>>     patch series anyway since it's been delayed too long already.
>>
>> That shouldn't be a problem, there are plenty of protocols where we don't
>> support creating the storage. Creating storage is needed only for
>> snapshots, so we can simply refuse to do it in the first place.
>>
>