From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Eduardo Otubo, Peter Maydell, Markus Armbruster, Stefan Hajnoczi, Paolo Bonzini
Date: Thu, 18 Apr 2019 17:13:11 +0100
Message-Id: <20190418161311.24197-1-stefanha@redhat.com>
Subject: [Qemu-devel] [PATCH] security.rst: add Security Guide to developer docs

At KVM Forum 2018 I gave a presentation on security in QEMU:

https://www.youtube.com/watch?v=YAdRf_hwxU8 (video)
https://vmsplice.net/~stefan/stefanha-kvm-forum-2018.pdf (slides)

This patch adds a security guide to the developer docs. This document covers
things that developers should know about security in QEMU. It is just a
starting point that we can expand on later. I hope it will be useful as a
resource for new contributors and will save code reviewers from explaining the
same concepts many times.

Signed-off-by: Stefan Hajnoczi
---
 docs/devel/index.rst    |   1 +
 docs/devel/security.rst | 220 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 221 insertions(+)
 create mode 100644 docs/devel/security.rst

diff --git a/docs/devel/index.rst b/docs/devel/index.rst
index ebbab636ce..fd0b5fa387 100644
--- a/docs/devel/index.rst
+++ b/docs/devel/index.rst
@@ -20,3 +20,4 @@ Contents:
    stable-process
    testing
    decodetree
+   security
diff --git a/docs/devel/security.rst b/docs/devel/security.rst
new file mode 100644
index 0000000000..c6a6c9973d
--- /dev/null
+++ b/docs/devel/security.rst
@@ -0,0 +1,220 @@
+==============
+Security Guide
+==============
+Overview
+--------
+This guide covers security topics relevant to developers working on QEMU. It
+includes an explanation of the security requirements that QEMU gives its users,
+the architecture of the code, and secure coding practices.
+
+Security Requirements
+---------------------
+QEMU supports many different use cases, some of which have stricter security
+requirements than others. The community has agreed on the overall security
+requirements that users may depend on. These requirements define what is
+considered supported from a security perspective.
+
+Virtualization Use Case
+~~~~~~~~~~~~~~~~~~~~~~~
+The virtualization use case covers cloud and virtual private server (VPS)
+hosting, as well as traditional data center and desktop virtualization.
+These use cases rely on hardware virtualization extensions to execute guest
+code safely on the physical CPU at close-to-native speed.
+
+The following entities are **untrusted**, meaning that they may be buggy or
+malicious:
+
+* Guest
+* User-facing interfaces (e.g. VNC, SPICE, WebSocket)
+* Network protocols (e.g. NBD, live migration)
+* User-supplied files (e.g. disk images, kernels, device trees)
+
+Bugs affecting these entities are evaluated on whether they can cause damage in
+real-world use cases and treated as security bugs if this is the case.
+
+Non-virtualization Use Case
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The non-virtualization use case covers emulation using the Tiny Code Generator
+(TCG). In principle the TCG and device emulation code used in conjunction with
+the non-virtualization use case should meet the same security requirements as
+the virtualization use case. However, for historical reasons much of the
+non-virtualization use case code was not written with these security
+requirements in mind.
+
+Bugs affecting the non-virtualization use case are not considered security
+bugs at this time. Users with non-virtualization use cases must not rely on
+QEMU to provide guest isolation or any security guarantees.
+
+Architecture
+------------
+This section describes the design principles that ensure the security
+requirements are met.
+
+Guest Isolation
+~~~~~~~~~~~~~~~
+Guest isolation is the confinement of guest code to the virtual machine. When
+guest code gains control of execution on the host this is called escaping the
+virtual machine. Isolation also includes resource limits such as CPU, memory,
+disk, or network throttling. Guests must be unable to exceed their resource
+limits.
+
+QEMU presents an attack surface to the guest in the form of emulated devices.
+The guest must not be able to gain control of QEMU. Bugs in emulated devices
+could allow malicious guests to gain code execution in QEMU.
+At this point the guest has escaped the virtual machine and is able to act in
+the context of the QEMU process on the host.
+
+Guests often interact with other guests and share resources with them. A
+malicious guest must not gain control of other guests or access their data.
+Disk image files and network traffic must be protected from other guests unless
+explicitly shared between them by the user.
+
+Principle of Least Privilege
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The principle of least privilege states that each component only has access to
+the privileges necessary for its function. In the case of QEMU this means that
+each process only has access to resources belonging to the guest.
+
+The QEMU process should not have access to any resources that are inaccessible
+to the guest. This way the guest does not gain anything by escaping into the
+QEMU process since it already has access to those same resources from within
+the guest.
+
+Following the principle of least privilege immediately fulfills guest isolation
+requirements. For example, guest A only has access to its own disk image file
+``a.img`` and not guest B's disk image file ``b.img``.
+
+In reality certain resources are inaccessible to the guest but must be
+available to QEMU to perform its function. For example, host system calls are
+necessary for QEMU but are not exposed to guests. A guest that escapes into
+the QEMU process can then begin invoking host system calls.
+
+New features must be designed to follow the principle of least privilege.
+Should this not be possible for technical reasons, the security risk must be
+clearly documented so users are aware of the trade-off of enabling the feature.
+
+Isolation mechanisms
+~~~~~~~~~~~~~~~~~~~~
+Several isolation mechanisms are available to realize this architecture of
+guest isolation and the principle of least privilege.
+With the exception of Linux seccomp, these mechanisms are all deployed by
+management tools that launch QEMU, such as libvirt. They are also
+platform-specific so they are only described briefly for Linux here.
+
+The fundamental isolation mechanism is that QEMU processes must run as
+**unprivileged users**. Sometimes it seems more convenient to launch QEMU as
+root to give it access to host devices (e.g. ``/dev/net/tun``) but this poses a
+huge security risk. File descriptor passing can be used to give an otherwise
+unprivileged QEMU process access to host devices without running QEMU as root.
+
+**SELinux** and **AppArmor** make it possible to confine processes beyond the
+traditional UNIX process and file permissions model. They restrict the QEMU
+process from accessing processes and files on the host system that are not
+needed by QEMU.
+
+**Resource limits** and **cgroup controllers** provide throughput and utilization
+limits on key resources such as CPU time, memory, and I/O bandwidth.
+
+**Linux namespaces** can be used to make process, file system, and other system
+resources unavailable to QEMU. A namespaced QEMU process is restricted to only
+those resources that were granted to it.
+
+**Linux seccomp** is available via the QEMU ``--sandbox`` option. It disables
+system calls that are not needed by QEMU, thereby reducing the host kernel
+attack surface.
+
+Secure coding practices
+-----------------------
+At the source code level there are several points to keep in mind. Both
+developers and security researchers must be aware of them so that they can
+develop safe code and audit existing code properly.
+
+General Secure C Coding Practices
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Most CVEs (security bugs) reported against QEMU are not specific to
+virtualization or emulation. They are simply C programming bugs. Therefore
+it's critical to be aware of common classes of security bugs.
+
+There is a wide selection of resources available covering secure C coding. For
+example, the `CERT C Coding Standard
+`_
+covers the most important classes of security bugs.
+
+Instead of describing them in detail here, only the names of the most important
+classes of security bugs are mentioned:
+
+* Buffer overflows
+* Use-after-free and double-free
+* Integer overflows
+* Format string vulnerabilities
+
+Some of these classes of bugs can be detected by analyzers. Static analysis is
+performed regularly by Coverity and the most obvious of these bugs are even
+reported by compilers. Dynamic analysis is possible with valgrind, tsan, and
+asan.
+
+Input Validation
+~~~~~~~~~~~~~~~~
+Inputs from the guest or external sources (e.g. network, files) cannot be
+trusted and may be invalid. Inputs must be checked before using them in a way
+that could crash the program, expose host memory to the guest, or otherwise be
+exploitable by an attacker.
+
+The most sensitive attack surface is device emulation. All hardware register
+accesses and data read from guest memory must be validated. A typical example
+is a device that contains multiple units that are selectable by the guest via
+an index register::
+
+  typedef struct {
+      ProcessingUnit unit[2];
+      ...
+  } MyDeviceState;
+
+  static void mydev_writel(void *opaque, uint32_t addr, uint32_t val)
+  {
+      MyDeviceState *mydev = opaque;
+      ProcessingUnit *unit;
+
+      switch (addr) {
+      case MYDEV_SELECT_UNIT:
+          unit = &mydev->unit[val];   <-- this input wasn't validated!
+          ...
+      }
+  }
+
+If ``val`` is not in range [0, 1] then an out-of-bounds memory access will take
+place when ``unit`` is dereferenced. The code must check that ``val`` is 0 or
+1 and handle the case where it is invalid.
+
+Unexpected Device Accesses
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+The guest may access device registers in unusual orders or at unexpected
+moments. Device emulation code must not assume that the guest follows the
+typical "theory of operation" presented in driver writer manuals. The guest
+may make nonsense accesses to device registers such as starting operations
+before the device has been fully initialized.
+
+A related issue is that device emulation code must be prepared for unexpected
+device register accesses while asynchronous operations are in progress. A
+well-behaved guest might wait for a completion interrupt before accessing
+certain device registers. Device emulation code must handle the case where the
+guest overwrites registers or submits further requests before an ongoing
+request completes. Unexpected accesses must not cause memory corruption or
+leaks in QEMU.
+
+Live migration
+~~~~~~~~~~~~~~
+Device state can be saved to disk image files and shared with other users.
+Live migration code must validate inputs when loading device state so an
+attacker cannot gain control by crafting invalid device states. Device state
+is therefore considered untrusted even though it is typically generated by QEMU
+itself.
+
+Guest Memory Access Races
+~~~~~~~~~~~~~~~~~~~~~~~~~
+Guests with multiple vCPUs may modify guest RAM while device emulation code is
+running. Device emulation code must copy in descriptors and other guest RAM
+structures and only process the local copy. This prevents
+time-of-check-to-time-of-use (TOCTOU) race conditions that could cause QEMU to
+crash when a vCPU thread modifies guest RAM while device emulation is
+processing it.
-- 
2.20.1
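
A possible illustration of the "Guest Memory Access Races" advice in the
guide, sketched as standalone C rather than real QEMU device code. The
descriptor layout (`GuestDesc`), the limits (`MYDEV_NUM_UNITS`,
`MYDEV_MAX_LEN`), and the function `mydev_process_desc()` are hypothetical
names chosen for this sketch, and guest RAM is modelled as a plain byte
buffer instead of QEMU's guest memory access helpers. The descriptor is
copied in once, and both validation and use operate on the local copy only,
so a concurrent vCPU write cannot change a value after it has been checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MYDEV_NUM_UNITS 2   /* hypothetical: matches unit[2] in the guide */
#define MYDEV_MAX_LEN   16  /* hypothetical upper bound on a request */

/* Hypothetical descriptor layout that the guest places in its RAM. */
typedef struct {
    uint32_t unit;  /* index of the processing unit to use */
    uint32_t len;   /* number of bytes requested */
} GuestDesc;

typedef struct {
    uint32_t bytes_done;
} ProcessingUnit;

typedef struct {
    ProcessingUnit unit[MYDEV_NUM_UNITS];
} MyDeviceState;

/*
 * Process one descriptor found at offset desc_off in guest RAM.
 * guest_ram/ram_size are a stand-in for guest memory in this sketch.
 * Returns 0 on success, -1 if the descriptor is rejected.
 */
static int mydev_process_desc(MyDeviceState *s, const uint8_t *guest_ram,
                              size_t ram_size, size_t desc_off)
{
    GuestDesc desc;

    assert(s != NULL && guest_ram != NULL);

    /* The descriptor itself must lie entirely inside guest RAM. */
    if (desc_off > ram_size || ram_size - desc_off < sizeof(desc)) {
        return -1;
    }

    /* Copy in once; from here on guest RAM is not read again. */
    memcpy(&desc, guest_ram + desc_off, sizeof(desc));

    /* Validate the local copy, which the guest can no longer modify. */
    if (desc.unit >= MYDEV_NUM_UNITS || desc.len > MYDEV_MAX_LEN) {
        return -1;
    }

    /* Use only the validated local copy. */
    s->unit[desc.unit].bytes_done += desc.len;
    return 0;
}
```

Checking the fields in place in guest RAM and then reading them again for
use would reopen the race, because the guest could substitute an
out-of-range `unit` between the check and the use.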