From: Daniel P. Berrangé
To: libvir-list@redhat.com
Cc: Luiz Capitulino
Subject: [libvirt PATCH] docs: add kbase entry showing KVM real time guest config
Date: Mon, 1 Jun 2020 12:44:17 +0100
Message-Id: <20200601114417.2863070-1-berrange@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

There are many different settings required to configure a KVM guest for
real time, low latency workloads. The documentation included here is based
on guidance developed & tested by the Red Hat KVM real time team.

Signed-off-by: Daniel P. Berrangé
Reviewed-by: Jiri Denemark
---
 docs/kbase.html.in          |   3 +
 docs/kbase/kvm-realtime.rst | 213 ++++++++++++++++++++++++++++++++++++
 2 files changed, 216 insertions(+)
 create mode 100644 docs/kbase/kvm-realtime.rst

diff --git a/docs/kbase.html.in b/docs/kbase.html.in
index c586e0f676..e663ca525f 100644
--- a/docs/kbase.html.in
+++ b/docs/kbase.html.in
@@ -36,6 +36,9 @@
 
Virtio-FS
Share a filesystem between the guest and the host
+ +
KVM real time
+
Run real time workloads in guests on a KVM hypervisor
 
diff --git a/docs/kbase/kvm-realtime.rst b/docs/kbase/kvm-realtime.rst
new file mode 100644
index 0000000000..ac6102879b
--- /dev/null
+++ b/docs/kbase/kvm-realtime.rst
@@ -0,0 +1,213 @@
+==========================
+KVM Real Time Guest Config
+==========================
+
+.. contents::
+
+The KVM hypervisor is capable of running real time guest workloads. This page
+describes the key pieces of configuration required in the domain XML to achieve
+the low latency needs of real time workloads.
+
+For the most part, configuration of the host OS is out of scope of this
+documentation. Refer to the operating system vendor's guidance on configuring
+the host OS and hardware for real time. Note in particular that the default
+kernel used by most Linux distros is not suitable for low latency real time and
+must be replaced by a special kernel build.
+
+
+Host partitioning plan
+======================
+
+Running real time workloads requires carefully partitioning up the host OS
+resources, such that the KVM / QEMU processes are strictly separated from any
+other workload running on the host, both userspace processes and kernel threads.
+
+As such, some subset of host CPUs needs to be reserved exclusively for running
+KVM guests. This requires that the host kernel be booted using the ``isolcpus``
+kernel command line parameter. This parameter removes a set of CPUs from the
+scheduler, such that no kernel threads or userspace processes will ever get
+placed on those CPUs automatically. KVM guests are then manually placed onto
+these CPUs.
+
+Deciding which host CPUs to reserve for real time requires understanding of the
+guest workload needs and balancing with the host OS needs. The trade off will
+also vary based on the physical hardware available.
+
+For the sake of illustration, this guide will assume a physical machine with two
+NUMA nodes, each with 2 sockets of 4 cores, giving a total of 16 CPUs on the
+host. Furthermore, it is assumed that hyperthreading is either not supported or
+has been disabled in the BIOS, since it is incompatible with real time. Each
+NUMA node is assumed to have 32 GB of RAM, giving 64 GB total for the host.
+
+It is assumed that 2 CPUs in each NUMA node are reserved for the host OS, with
+the remaining 6 CPUs available for KVM real time. With this in mind, the host
+kernel should have booted with ``isolcpus=2-7,10-15`` to reserve CPUs.
+
+To maximise efficiency of page table lookups for the guest, the host needs to be
+configured with most RAM exposed as huge pages, ideally 1 GB sized. 6 GB of RAM
+in each NUMA node will be reserved for general host OS usage as normal sized
+pages, leaving 26 GB for KVM usage as huge pages.
+
+Once huge pages are reserved on the hypothetical machine, the ``virsh
+capabilities`` command output is expected to look approximately like:
+
+::
+
+  <topology>
+    <cells num='2'>
+      <cell id='0'>
+        <memory unit='KiB'>33554432</memory>
+        <pages unit='KiB' size='4'>1572864</pages>
+        <pages unit='KiB' size='2048'>0</pages>
+        <pages unit='KiB' size='1048576'>26</pages>
+        ...
+      </cell>
+      <cell id='1'>
+        <memory unit='KiB'>33554432</memory>
+        <pages unit='KiB' size='4'>1572864</pages>
+        <pages unit='KiB' size='2048'>0</pages>
+        <pages unit='KiB' size='1048576'>26</pages>
+        ...
+      </cell>
+    </cells>
+  </topology>
+
+Be aware that CPU ID numbers are not always allocated sequentially as assumed
+in this example. It is not unusual to see IDs interleaved between sockets on
+the two NUMA nodes, such that ``0-3,8-11`` are on the first node and
+``4-7,12-15`` are on the second node. Carefully check the ``virsh capabilities``
+output to determine the CPU ID numbers when configuring both ``isolcpus`` and
+the guest ``cpuset`` values.
+
+Guest configuration
+===================
+
+What follows is an overview of the key parts of the domain XML that need to be
+configured to achieve low latency for real time workloads. The following example
+will assume a 4 CPU guest, requiring 16 GB of RAM.
+It is intended to be placed on the second host NUMA node.
+
+CPU configuration
+-----------------
+
+Real time KVM guests intended to run Linux should have a minimum of 2 CPUs.
+One vCPU is for running non-real time processes and performing I/O. The other
+vCPUs will run real time applications. Some non-Linux OS may not require a
+special non-real time CPU to be available, in which case the 2 CPU minimum would
+not apply.
+
+Each guest CPU, even the non-real time one, needs to be pinned to a dedicated
+host core that is in the ``isolcpus`` reserved set. The QEMU emulator threads
+also need to be pinned to host CPUs that are not in the ``isolcpus`` reserved set.
+The vCPUs need to be given a real time CPU scheduler policy.
+
+When configuring the `guest CPU count <../formatdomain.html#elementsCPUAllocation>`_,
+do not include any CPU affinity at this stage:
+
+::
+
+  <vcpu>4</vcpu>
+
+The guest CPUs now need to be placed individually. In this case, they will all
+be put within the same host socket, such that they can be exposed as core
+siblings. This is achieved using the `CPU tuning config <../formatdomain.html#elementsCPUTuning>`_:
+
+::
+
+  <cputune>
+    ...
+  </cputune>
+
+The guest CPU model now needs to be
+configured to pass through the host model unchanged, with topology matching the
+placement:
+
+::
+
+  <cpu mode='host-passthrough'>
+    ...
+  </cpu>
+
+The performance monitoring unit virtualization needs to be disabled
+via the `hypervisor features <../formatdomain.html#elementsFeatures>`_:
+
+::
+
+  <features>
+    ...
+    <pmu state='off'/>
+  </features>
+
+
+Memory configuration
+--------------------
+
+The host memory used for guest RAM needs to be allocated from huge pages on the
+second NUMA node, and all other memory allocation needs to be locked into RAM
+with memory page sharing disabled.
+This is achieved by using the memory backing config:
+
+::
+
+  <memoryBacking>
+    <hugepages>
+      ...
+    </hugepages>
+    <locked/>
+    <nosharepages/>
+  </memoryBacking>
+
+
+Device configuration
+--------------------
+
+Libvirt adds a few devices by default to maintain historical QEMU configuration
+behaviour.
+It is unlikely these devices are required by real time guests, so it
+is wise to disable them. Remove all USB controllers that may exist in the XML
+config and replace them with:
+
+::
+
+  <controller type='usb' model='none'/>
+
+Similarly, the memory balloon config should be changed to:
+
+::
+
+  <memballoon model='none'/>
+
+If the guest had a graphical console at installation time, this can also be
+disabled, with remote access being over SSH and a minimal serial console
+kept for emergencies.
-- 
2.26.2
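
The partitioning arithmetic from "Host partitioning plan" and the pinning rules from "CPU configuration" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the patch: the helper names (`cpu_ranges`, `isolated_cpus`, `cputune`) are hypothetical, CPU IDs are assumed to be numbered sequentially within each NUMA node (which the document warns is not always the case), and the choice of host CPUs 8-9 for the emulator threads and of the `fifo` scheduler with priority 1 is an assumption rather than something the text specifies.

```python
import xml.etree.ElementTree as ET


def cpu_ranges(cpus):
    """Format CPU IDs as a kernel-style list such as '2-7,10-15'."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def isolated_cpus(nodes=2, cpus_per_node=8, host_cpus_per_node=2):
    """Map each NUMA node to the CPUs left over for real time guests.

    Assumption: CPU IDs are sequential within each node, and the first
    CPUs of every node stay with the host OS.
    """
    plan = {}
    for node in range(nodes):
        first = node * cpus_per_node
        plan[node] = list(range(first + host_cpus_per_node,
                                first + cpus_per_node))
    return plan


def cputune(vcpu_pins, emulator_cpus, priority=1):
    """Build a <cputune> element pinning vCPU i to vcpu_pins[i].

    Assumption: 'fifo' with the given priority stands in for the
    "real time CPU scheduler policy" mentioned in the text.
    """
    root = ET.Element("cputune")
    ET.SubElement(root, "emulatorpin", cpuset=cpu_ranges(emulator_cpus))
    for vcpu, host_cpu in enumerate(vcpu_pins):
        ET.SubElement(root, "vcpupin", vcpu=str(vcpu), cpuset=str(host_cpu))
    ET.SubElement(root, "vcpusched",
                  vcpus=cpu_ranges(range(len(vcpu_pins))),
                  scheduler="fifo", priority=str(priority))
    return root


plan = isolated_cpus()
print("isolcpus=" + cpu_ranges(plan[0] + plan[1]))  # isolcpus=2-7,10-15

# Place the 4 vCPUs on the last four isolated CPUs of the second node,
# keeping the emulator threads on that node's host-reserved CPUs.
tune = cputune(vcpu_pins=plan[1][-4:], emulator_cpus=[8, 9])
print(ET.tostring(tune, encoding="unicode"))
```

On a machine with interleaved CPU numbering, the per-node lists would have to be read from the ``virsh capabilities`` output instead of being computed this way.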