From: Daniel P. Berrangé
To: libvir-list@redhat.com
Subject: [libvirt PATCH] docs: add a kbase explaining security protections for QEMU passthrough
Date: Thu, 6 Feb 2020 13:05:37 +0000
Message-Id: <20200206130537.2397155-1-berrange@redhat.com>

When using command line passthrough, users will often trip over
security protections such as SELinux, DAC, namespaces, etc., which
deny access to the files they are passing.

This document explains the various protections and how to deal with
their policy, and/or how to disable them.

Signed-off-by: Daniel P. Berrangé
Reviewed-by: Ján Tomko
Reviewed-by: Kashyap Chamarthy
---
 docs/kbase.html.in                       |   4 +
 docs/kbase/qemu-passthrough-security.rst | 157 +++++++++++++++++++++++
 2 files changed, 161 insertions(+)
 create mode 100644 docs/kbase/qemu-passthrough-security.rst

diff --git a/docs/kbase.html.in b/docs/kbase.html.in
index c156414c41..db84b95b60 100644
--- a/docs/kbase.html.in
+++ b/docs/kbase.html.in
@@ -29,6 +29,10 @@
 Backing chain management
 Explanation of how disk backing chain specification impacts libvirt's behaviour and basic troubleshooting steps of disk problems.
+
+Security with QEMU passthrough
+Examination of the security protections used for QEMU and how they need
+  configuring to allow use of QEMU passthrough with host files/devices.
 
diff --git a/docs/kbase/qemu-passthrough-security.rst b/docs/kbase/qemu-passthrough-security.rst
new file mode 100644
index 0000000000..7fb1f6fbdd
--- /dev/null
+++ b/docs/kbase/qemu-passthrough-security.rst
@@ -0,0 +1,157 @@
+=============================
+QEMU command line passthrough
+=============================
+
+.. contents::
+
+Libvirt aims to provide explicit modelling of virtualization features in
+the domain XML document schema. QEMU has a very broad range of features
+and not all of these can be mapped to elements in the domain XML. Libvirt
+would like to reduce the gap to QEMU; however, with finite resources there
+will always be cases which aren't covered by the domain XML schema.
+
+
+XML document additions
+======================
+
+To deal with the problem, libvirt introduced support for command line
+passthrough of QEMU arguments. This is achieved by supporting a custom
+XML namespace, under which some QEMU driver specific elements are defined.
+
+The canonical place to declare the namespace is on the top level ``<domain>``
+element. At the very end of the document, arbitrary command line arguments
+can now be added, using the namespace prefix ``qemu:``
+
+::
+
+   <domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
+     <name>QEMUGuest1</name>
+     <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+     ...
+     <qemu:commandline>
+       <qemu:arg value='-newarg'/>
+       <qemu:arg value='parameter'/>
+       <qemu:env name='ID' value='wibble'/>
+       <qemu:env name='BAR'/>
+     </qemu:commandline>
+   </domain>
+
+Note that when an argument takes a value e.g. ``-newarg parameter``, the argument
+and the value must be passed as separate ``<qemu:arg>`` entries.
+
+Instead of declaring the XML namespace on the top level ``<domain>``, it is also
+possible to declare it at the point of use, which is more convenient for humans
+writing the XML documents manually. So the following example is functionally
+identical:
+
+::
+
+   <domain type='qemu'>
+     <name>QEMUGuest1</name>
+     <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+     ...
+     <commandline xmlns="http://libvirt.org/schemas/domain/qemu/1.0">
+       <arg value='-newarg'/>
+       <arg value='parameter'/>
+       <env name='ID' value='wibble'/>
+       <env name='BAR'/>
+     </commandline>
+   </domain>
+
+Note that when querying the XML from libvirt, it will have been translated into
+the canonical syntax once more with the namespace on the top level element.
+
+Security confinement / sandboxing
+=================================
+
+When libvirt launches a QEMU process it makes use of a number of security
+technologies to confine QEMU and thus protect the host from malicious VM
+breakouts.
+
+When configuring security protection, however, libvirt generally needs to know
+exactly which host resources the VM is permitted to access. It gets this
+information from the domain XML document. This only works for elements in the
+regular schema; the arguments used with command line passthrough are completely
+opaque to libvirt.
+
+As a result, if command line passthrough is used to expose a file on the host
+to QEMU, the security protections will activate and either kill QEMU or deny it
+access.
+
+There are two strategies for dealing with this problem: either figure out what
+steps are needed to grant QEMU access to the device, or disable the security
+protections. The former is harder but more secure, while the latter is simpler.
+
+Granting access per VM
+----------------------
+
+* SELinux - the file on the host needs an SELinux label that will grant access
+  to QEMU's ``svirt_t`` policy.
+
+  - Read-only access - use the ``virt_content_t`` label
+  - Shared, write access - use the ``svirt_image_t:s0`` label (i.e. no MCS
+    category appended)
+  - Exclusive, write access - use the ``svirt_image_t:s0:MCS`` label for the VM.
+    The MCS is auto-generated at boot time, so this may require re-configuring
+    the VM to have a fixed MCS label.
+
+* DAC - the file on the host needs to be readable/writable to the ``qemu``
+  user or ``qemu`` group.
+  This can be done by changing the file ownership to ``qemu``, or relaxing the
+  permissions to allow world read, or adding file ACLs to allow access to ``qemu``.
+
+* Namespaces - a private ``mount`` namespace is used for QEMU by default
+  which populates a new ``/dev`` with only the device nodes needed by QEMU.
+  There is no way to augment the set of device nodes ahead of time.
+
+* Seccomp - libvirt launches QEMU with its built-in seccomp policy enabled with
+  ``obsolete=deny``, ``elevateprivileges=deny``, ``spawn=deny`` and
+  ``resourcecontrol=deny`` settings active. There is no way to change this
+  policy on a per VM basis.
+
+* Cgroups - a custom cgroup is created per VM and this will either use the
+  ``devices`` controller or a ``BPF`` rule to whitelist a set of device nodes.
+  There is no way to change this policy on a per VM basis.
+
+Disabling security protection per VM
+------------------------------------
+
+Some of the security protections can be disabled per-VM:
+
+* SELinux - in the domain XML the ``<seclabel>`` model can be changed to
+  ``none`` instead of ``selinux``, which will make the VM run unconfined.
+
+* DAC - in the domain XML a ``<seclabel>`` element with the ``dac`` model can
+  be added, configured with a user / group account of ``root`` to make QEMU run
+  with full privileges.
+
+* Namespaces - there is no way to disable this per VM.
+
+* Seccomp - there is no way to disable this per VM.
+
+* Cgroups - there is no way to disable this per VM.
+
+Disabling security protection host-wide
+---------------------------------------
+
+As a last resort it is possible to disable security protection host-wide, which
+will affect all virtual machines. These settings are all made in
+``/etc/libvirt/qemu.conf``.
+
+* SELinux - set ``security_default_confined = 0`` to make QEMU run unconfined by
+  default, while still allowing explicit opt-in to SELinux for VMs.
+
+* DAC - set ``user = root`` and ``group = root`` to make QEMU run as the root
+  account.
+
+* SELinux, DAC - set ``security_driver = []`` to entirely disable both the
+  SELinux and DAC security drivers.
+
+* Namespaces - set ``namespaces = []`` to disable use of the ``mount``
+  namespace, causing QEMU to see the normal fully populated ``/dev``.
+
+* Seccomp - set ``seccomp_sandbox = 0`` to disable use of the Seccomp sandboxing
+  in QEMU.
+
+* Cgroups - set ``cgroup_device_acl`` to include the desired device node, or
+  ``cgroup_controllers = [...]`` to exclude the ``devices`` controller.
-- 
2.24.1