From nobody Thu Jun 13 22:57:21 2024
From: Michal Privoznik
To: libvir-list@redhat.com
Subject: [PATCH 1/2] qemu_domain: Increase memlock limit for NVMe disks
Date: Fri, 14 Apr 2023 12:02:26 +0200
Message-Id: <21d598ce185e322e78af9e182a3ae89867874613.1681466531.git.mprivozn@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When starting QEMU, or when hotplugging a PCI device, QEMU might
lock some memory. How much? Well, that's an undecidable problem: a
Turing machine that halts, halts in a finite number of steps,
and thus it can move tape only so many times. Now, does a given TM
halt? QED.

But despite that, we try to guess. And it more or less works,
until there's a counterexample. This time, it's a guest with
both <hostdev/> and an NVMe <disk/>. I've started a simple guest
with 4GiB of memory:

  # virsh dominfo fedora
  Max memory:     4194304 KiB
  Used memory:    4194304 KiB

And here are the amounts of memory that QEMU tried to lock,
obtained via:

  grep VmLck /proc/$(pgrep qemu-kvm)/status

  1) with just one <hostdev/>
     VmLck:   4194308 kB

  2) with just one NVMe <disk/>
     VmLck:   4328544 kB

  3) with one <hostdev/> and one NVMe <disk/>
     VmLck:   8522852 kB

Now, what's surprising is case 2) where the locked memory exceeds
the VM memory. It almost resembles VDPA. Therefore, treat it as
such.

Unfortunately, I don't have a box with two or more spare NVMe-s
so I can't tell for sure. But setting the limit too tight means
QEMU refuses to start.

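As a quick sanity check on the three VmLck readings above, here is a
minimal sketch (not part of the patch) that redoes the arithmetic. It
shows that case 3) is exactly the sum of cases 1) and 2), and that the
NVMe-only case exceeds the guest's memory by 134240 KiB (roughly
131 MiB). The macro and function names are made up for illustration;
only the numbers come from the message.

```c
#include <assert.h>

/* Values quoted in the commit message, in KiB (VmLck's "kB" means KiB). */
#define GUEST_MEM_KIB      4194304LL  /* virsh dominfo: Max memory */
#define LOCKED_HOSTDEV_KIB 4194308LL  /* case 1: one hostdev */
#define LOCKED_NVME_KIB    4328544LL  /* case 2: one NVMe disk */
#define LOCKED_BOTH_KIB    8522852LL  /* case 3: one hostdev + one NVMe disk */

/* How far the NVMe-only case goes beyond the guest's memory, in KiB. */
static long long nvme_overshoot_kib(void)
{
    return LOCKED_NVME_KIB - GUEST_MEM_KIB;
}

/* Non-zero if the combined case equals the sum of the single cases. */
static int locking_is_additive(void)
{
    return LOCKED_HOSTDEV_KIB + LOCKED_NVME_KIB == LOCKED_BOTH_KIB;
}

/* Aborts if the quoted measurements do not add up as described. */
static void check_measurements(void)
{
    assert(nvme_overshoot_kib() > 0);
    assert(locking_is_additive());
}
```

The additive behaviour is what motivates counting each NVMe disk as
one more full copy of guest memory, the same way VDPA devices are
counted.
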
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2014030
Signed-off-by: Michal Privoznik
Reviewed-by: Martin Kletzander
---
 src/qemu/qemu_domain.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 63b13b6875..41db98880c 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -9532,7 +9532,7 @@ getPPC64MemLockLimitBytes(virDomainDef *def,
 
 
 static int
-qemuDomainGetNumVFIODevices(const virDomainDef *def)
+qemuDomainGetNumVFIOHostdevs(const virDomainDef *def)
 {
     size_t i;
     int n = 0;
@@ -9542,10 +9542,22 @@ qemuDomainGetNumVFIODevices(const virDomainDef *def)
             virHostdevIsMdevDevice(def->hostdevs[i]))
             n++;
     }
+
+    return n;
+}
+
+
+static int
+qemuDomainGetNumNVMeDisks(const virDomainDef *def)
+{
+    size_t i;
+    int n = 0;
+
     for (i = 0; i < def->ndisks; i++) {
         if (virStorageSourceChainHasNVMe(def->disks[i]->src))
             n++;
     }
+
     return n;
 }
 
@@ -9585,6 +9597,7 @@ qemuDomainGetMemLockLimitBytes(virDomainDef *def,
 {
     unsigned long long memKB = 0;
     int nvfio;
+    int nnvme;
     int nvdpa;
 
     /* prefer the hard limit */
@@ -9604,7 +9617,8 @@ qemuDomainGetMemLockLimitBytes(virDomainDef *def,
     if (ARCH_IS_PPC64(def->os.arch) && def->virtType == VIR_DOMAIN_VIRT_KVM)
         return getPPC64MemLockLimitBytes(def, forceVFIO);
 
-    nvfio = qemuDomainGetNumVFIODevices(def);
+    nvfio = qemuDomainGetNumVFIOHostdevs(def);
+    nnvme = qemuDomainGetNumNVMeDisks(def);
     nvdpa = qemuDomainGetNumVDPANetDevices(def);
     /* For device passthrough using VFIO the guest memory and MMIO memory
      * regions need to be locked persistent in order to allow DMA.
@@ -9624,16 +9638,17 @@ qemuDomainGetMemLockLimitBytes(virDomainDef *def,
      *
      * Note that this may not be valid for all platforms.
      */
-    if (forceVFIO || nvfio || nvdpa) {
+    if (forceVFIO || nvfio || nnvme || nvdpa) {
         /* At present, the full memory needs to be locked for each VFIO / VDPA
-         * device. For VFIO devices, this only applies when there is a vIOMMU
-         * present. Yes, this may result in a memory limit that is greater than
-         * the host physical memory, which is not ideal. The long-term solution
-         * is a new userspace iommu interface (iommufd) which should eliminate
-         * this duplicate memory accounting. But for now this is the only way
-         * to enable configurations with e.g. multiple vdpa devices.
+         * NVMe device. For VFIO devices, this only applies when there is a
+         * vIOMMU present. Yes, this may result in a memory limit that is
+         * greater than the host physical memory, which is not ideal. The
+         * long-term solution is a new userspace iommu interface (iommufd)
+         * which should eliminate this duplicate memory accounting. But for now
+         * this is the only way to enable configurations with e.g. multiple
+         * VDPA/NVMe devices.
          */
-        int factor = nvdpa;
+        int factor = nvdpa + nnvme;
 
         if (nvfio || forceVFIO) {
             if (nvfio && def->iommu)
-- 
2.39.2

From nobody Thu Jun 13 22:57:21 2024
From: Michal Privoznik
To: libvir-list@redhat.com
Subject: [PATCH 2/2] qemumemlocktest: Introduce pc-hostdev-nvme test case
Date: Fri, 14 Apr 2023 12:02:27 +0200
Message-Id: <47d1b4e0a6c79eaeb4a4a6f674f0abe2f3d8329c.1681466531.git.mprivozn@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This basically just demonstrates the limit raised by the previous
commit.

Signed-off-by: Michal Privoznik
Reviewed-by: Martin Kletzander
---
 .../qemumemlock-pc-hostdev-nvme.xml | 24 +++++++++++++++++++
 tests/qemumemlocktest.c             |  1 +
 2 files changed, 25 insertions(+)
 create mode 100644 tests/qemumemlockdata/qemumemlock-pc-hostdev-nvme.xml

diff --git a/tests/qemumemlockdata/qemumemlock-pc-hostdev-nvme.xml b/tests/qemumemlockdata/qemumemlock-pc-hostdev-nvme.xml
new file mode 100644
index 0000000000..06f1496970
--- /dev/null
+++ b/tests/qemumemlockdata/qemumemlock-pc-hostdev-nvme.xml
@@ -0,0 +1,24 @@
+
+ guest
+ 1048576
+ 1
+
+ hvm
+
+
+ /usr/bin/qemu-system-x86_64
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/tests/qemumemlocktest.c b/tests/qemumemlocktest.c
index 61b73e1d79..c53905a7dd 100644
--- a/tests/qemumemlocktest.c
+++ b/tests/qemumemlocktest.c
@@ -97,6 +97,7 @@ mymain(void)
     DO_TEST("pc-hardlimit", 2147483648);
     DO_TEST("pc-locked", VIR_DOMAIN_MEMORY_PARAM_UNLIMITED);
     DO_TEST("pc-hostdev", 2147483648);
+    DO_TEST("pc-hostdev-nvme", 3221225472);
 
     DO_TEST("pc-hardlimit+locked", 2147483648);
     DO_TEST("pc-hardlimit+hostdev", 2147483648);
-- 
2.39.2
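
To see where the expected values in qemumemlocktest.c come from, the
following is a simplified, standalone sketch of the limit calculation
touched by patch 1/2. It is not the libvirt code: struct guest_cfg and
memlock_limit_bytes() are hypothetical stand-ins, and the hard-limit,
locked-memory and PPC64 paths are left out. The "else" branch that adds
one copy of guest memory for VFIO without a vIOMMU, and the extra 1 GiB
of headroom, are not visible in the hunks above; they are assumptions,
chosen because they reproduce the test values for a 1048576 KiB guest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the parts of virDomainDef used here. */
struct guest_cfg {
    unsigned long long mem_kib; /* total guest memory in KiB */
    int nvfio;                  /* number of VFIO/mdev hostdevs */
    int nnvme;                  /* number of NVMe disks */
    int nvdpa;                  /* number of VDPA network devices */
    bool has_viommu;            /* stands in for def->iommu */
};

/* Returns the memlock limit in bytes, or 0 if no device needs locked memory. */
static unsigned long long
memlock_limit_bytes(const struct guest_cfg *cfg, bool force_vfio)
{
    unsigned long long limit_kib = 0;

    assert(cfg != NULL);

    if (force_vfio || cfg->nvfio || cfg->nnvme || cfg->nvdpa) {
        /* Each VDPA and each NVMe device accounts for the full guest memory. */
        int factor = cfg->nvdpa + cfg->nnvme;

        if (cfg->nvfio || force_vfio) {
            if (cfg->nvfio && cfg->has_viommu)
                factor += cfg->nvfio; /* one copy per VFIO device */
            else
                factor += 1;          /* assumption: one shared copy */
        }

        /* Assumption: 1 GiB (in KiB) of headroom on top of guest memory. */
        limit_kib = factor * cfg->mem_kib + 1024 * 1024;
    }

    return limit_kib << 10; /* KiB to bytes */
}
```

With a 1 GiB guest, one hostdev alone gives factor 1 and a limit of
2147483648 bytes ("pc-hostdev"); adding one NVMe disk raises the factor
to 2 and the limit to 3221225472 bytes ("pc-hostdev-nvme").
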