From: Daniel Henrique Barboza <danielhb413@gmail.com>
To: libvir-list@redhat.com
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>, pkrempa@redhat.com
Subject: [PATCH 2/5] qemu_domain.c: NUMA CPUs auto-fill for incomplete topologies
Date: Mon, 1 Jun 2020 14:50:38 -0300
Message-Id: <20200601175041.1607723-3-danielhb413@gmail.com>
In-Reply-To: <20200601175041.1607723-1-danielhb413@gmail.com>
References: <20200601175041.1607723-1-danielhb413@gmail.com>

Libvirt allows the user to define an incomplete NUMA topology, where
the sum of all CPUs in each cell is less than the total number of
vCPUs. What ends up happening is that QEMU assigns the non-enumerated
CPUs to the first NUMA node. This behavior has been flagged as 'to be
deprecated' at least since QEMU commit ec78f8114bc4 ("numa: use
possible_cpus for not mapped CPUs check").
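To make the incomplete-topology case concrete, the standalone sketch
below (illustrative only; plain C arrays standing in for the <numa>
cell definitions, not libvirt data structures) shows an 8-vCPU guest
that only enumerates CPUs 0-3 across two cells. QEMU currently puts
the four unlisted vCPUs in node 0, leaving it with a disjointed CPU
range:

#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

int main(void)
{
    const size_t vcpus_max = 8;           /* <vcpu>8</vcpu>            */
    const size_t cell0[] = { 0, 1 };      /* <cell id='0' cpus='0-1'/> */
    const size_t cell1[] = { 2, 3 };      /* <cell id='1' cpus='2-3'/> */
    bool node0[8] = { false };
    bool enumerated[8] = { false };
    size_t i;

    for (i = 0; i < sizeof(cell0) / sizeof(cell0[0]); i++) {
        node0[cell0[i]] = true;
        enumerated[cell0[i]] = true;
    }
    for (i = 0; i < sizeof(cell1) / sizeof(cell1[0]); i++)
        enumerated[cell1[i]] = true;

    /* current QEMU behavior: every vCPU that no cell claimed is
     * silently assigned to the first NUMA node */
    for (i = 0; i < vcpus_max; i++) {
        if (!enumerated[i])
            node0[i] = true;
    }

    printf("node 0 CPU set:");
    for (i = 0; i < vcpus_max; i++) {
        if (node0[i])
            printf(" %zu", i);
    }
    printf("\n");    /* node 0 CPU set: 0 1 4 5 6 7 -- a disjointed range */
    return 0;
}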
In [1], Maxiwell suggested that we forbid the user from defining such
topologies. In his review [2], Peter Krempa pointed out that we can't
break existing guests, and suggested that Libvirt should emulate the
QEMU behavior of putting the remaining vCPUs in the first NUMA node
in these cases.

This patch implements Peter Krempa's suggestion. Since the auto-fill
will most likely leave node 0 with a disjointed NUMA CPU range, the
auto-fill is made dependent on QEMU_CAPS_NUMA.

A follow-up patch will update the documentation not only to describe
the auto-fill mechanism for incomplete NUMA topologies, but also to
discourage users from creating such topologies in the future. This
approach also makes Libvirt independent of whether QEMU changes its
current behavior: we either auto-fill the CPUs in node 0 or the user
(hopefully) is aware that incomplete topologies, although supported
in Libvirt, are to be avoided.

[1] https://www.redhat.com/archives/libvir-list/2019-June/msg00224.html
[2] https://www.redhat.com/archives/libvir-list/2019-June/msg00263.html

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 src/qemu/qemu_domain.c | 47 ++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_domain.h |  4 ++++
 src/qemu/qemu_driver.c |  9 ++++++++
 3 files changed, 60 insertions(+)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index d5e3d1a3cc..8034b6a219 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4953,6 +4953,50 @@ qemuDomainDefTsegPostParse(virDomainDefPtr def,
 }
 
 
+/**
+ * qemuDomainDefNumaCPUsRectify:
+ * @def: domain definition
+ * @qemuCaps: QEMU capabilities of the domain
+ *
+ * This function emulates the (to be deprecated) QEMU behavior of
+ * filling up node0 with the remaining CPUs, in case of an incomplete
+ * NUMA setup, up to virDomainDefGetVcpusMax().
+ *
+ * Returns: 0 on success, -1 on error
+ */
+int
+qemuDomainDefNumaCPUsRectify(virDomainDefPtr def, virQEMUCapsPtr qemuCaps)
+{
+    unsigned int vcpusMax, numacpus;
+
+    /* QEMU_CAPS_NUMA tells us if QEMU is able to handle disjointed
+     * NUMA CPU ranges. The filling process will create a disjointed
+     * setup in node0 most of the time. Do not proceed if QEMU
+     * can't handle it. */
+    if (virDomainNumaGetNodeCount(def->numa) == 0 ||
+        !virQEMUCapsGet(qemuCaps, QEMU_CAPS_NUMA))
+        return 0;
+
+    vcpusMax = virDomainDefGetVcpusMax(def);
+    numacpus = virDomainNumaGetCPUCountTotal(def->numa);
+
+    if (numacpus < vcpusMax) {
+        if (virDomainNumaFillCPUsInNode(def->numa, 0, vcpusMax) < 0)
+            return -1;
+    }
+
+    return 0;
+}
+
+
+static int
+qemuDomainDefNumaCPUsPostParse(virDomainDefPtr def,
+                               virQEMUCapsPtr qemuCaps)
+{
+    return qemuDomainDefNumaCPUsRectify(def, qemuCaps);
+}
+
+
 static int
 qemuDomainDefPostParseBasic(virDomainDefPtr def,
                             void *opaque G_GNUC_UNUSED)
@@ -5039,6 +5083,9 @@ qemuDomainDefPostParse(virDomainDefPtr def,
     if (qemuDomainDefTsegPostParse(def, qemuCaps) < 0)
         return -1;
 
+    if (qemuDomainDefNumaCPUsPostParse(def, qemuCaps) < 0)
+        return -1;
+
     return 0;
 }
 
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 41d3f1561d..e78a2b935d 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -1297,3 +1297,7 @@ qemuDomainInitializePflashStorageSource(virDomainObjPtr vm);
 bool
 qemuDomainDiskBlockJobIsSupported(virDomainObjPtr vm,
                                   virDomainDiskDefPtr disk);
+
+int
+qemuDomainDefNumaCPUsRectify(virDomainDefPtr def,
+                             virQEMUCapsPtr qemuCaps);
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index dd9ae30bb5..9f4f11f15c 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -4999,6 +4999,7 @@ qemuDomainSetVcpusMax(virQEMUDriverPtr driver,
                       unsigned int nvcpus)
 {
     g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
+    g_autoptr(virQEMUCaps) qemuCaps = NULL;
     unsigned int topologycpus;
 
     if (def) {
@@ -5029,6 +5030,14 @@ qemuDomainSetVcpusMax(virQEMUDriverPtr driver,
     if (virDomainDefSetVcpusMax(persistentDef, nvcpus, driver->xmlopt) < 0)
         return -1;
 
+    /* re-adjust NUMA nodes if needed */
+    if (!(qemuCaps = virQEMUCapsCacheLookup(driver->qemuCapsCache,
+                                            persistentDef->emulator)))
+        return -1;
+
+    if (qemuDomainDefNumaCPUsRectify(persistentDef, qemuCaps) < 0)
+        return -1;
+
     if (virDomainDefSave(persistentDef, driver->xmlopt, cfg->configDir) < 0)
         return -1;
 
-- 
2.26.2
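For context on the qemu_driver.c hunk above: raising the maximum vCPU
count of a persistent definition can reopen the gap between the CPUs
enumerated in the NUMA cells and vcpusMax, which is why
qemuDomainSetVcpusMax() re-runs the rectify step. The sketch below
mirrors only the decision logic of qemuDomainDefNumaCPUsRectify()
with a hypothetical FakeDef structure; it is illustrative and does
not use the real virDomainDef/virDomainNuma/virQEMUCaps API.

#include <stdio.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pieces of the domain definition that
 * the rectify helper looks at; not libvirt API. */
typedef struct {
    unsigned int vcpusMax;      /* virDomainDefGetVcpusMax()       */
    unsigned int numaNodes;     /* virDomainNumaGetNodeCount()     */
    unsigned int numaCpus;      /* virDomainNumaGetCPUCountTotal() */
    bool capsNuma;              /* QEMU_CAPS_NUMA                  */
} FakeDef;

/* Mirrors the control flow of the new helper: no-op when there are no
 * NUMA nodes or QEMU cannot take disjointed CPU ranges, fill node 0
 * only when the enumerated CPUs fall short of the maximum. */
static int
rectify(FakeDef *def)
{
    if (def->numaNodes == 0 || !def->capsNuma)
        return 0;

    if (def->numaCpus < def->vcpusMax) {
        printf("auto-filling node 0 with %u extra vCPU(s)\n",
               def->vcpusMax - def->numaCpus);
        def->numaCpus = def->vcpusMax;
    }

    return 0;
}

int main(void)
{
    /* guest defined with 4 vCPUs, all of them enumerated in its cells */
    FakeDef def = { 4, 2, 4, true };

    rectify(&def);        /* complete topology: nothing to do */

    def.vcpusMax = 8;     /* e.g. virsh setvcpus --maximum 8 --config */
    rectify(&def);        /* prints: auto-filling node 0 with 4 extra vCPU(s) */

    return 0;
}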