From: mgalaxy@akamai.com
To: devel@lists.libvirt.org
CC: steven.sistare@oracle.com, johunt@akamai.com, cweng@akamai.com, bchaney@akamai.com, trakotoz@akamai.com, Michael Galaxy
Subject: [PATCH 3/5] Implement multiple memory backing paths
Date: Mon, 29 Jan 2024 16:43:55 -0500
Message-Id: <20240129214357.1281805-4-mgalaxy@akamai.com>
In-Reply-To: <20240129214357.1281805-1-mgalaxy@akamai.com>
References: <20240129214357.1281805-1-mgalaxy@akamai.com>
List-Id: Development discussions about the libvirt library & tools

From: Michael Galaxy

This patch handles three use cases:

1. The domain has multiple NUMA nodes, but only a single directory path
   is specified in qemu.conf (original default behavior).

2. The domain has multiple NUMA nodes, and multiple directory paths are
   requested as well (new behavior).

3. The domain has a single NUMA node, but multiple directory paths are
   requested (new behavior).

Each case is described in more detail in the inline comments below.

Signed-off-by: Michael Galaxy
---
 src/qemu/qemu_command.c |   8 +++-
 src/qemu/qemu_conf.c    | 101 ++++++++++++++++++++++++++++++++++++----
 src/qemu/qemu_conf.h    |  11 +++--
 3 files changed, 106 insertions(+), 14 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index b8f071ff2a..818e409d20 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3448,7 +3448,9 @@ qemuBuildMemoryBackendProps(virJSONValue **backendProps,
         } else {
             /* We can have both pagesize and mem source. If that's the case,
              * prefer hugepages as those are more specific. */
-            if (qemuGetMemoryBackingPath(priv->driver, def, mem->info.alias, &memPath) < 0)
+
+            virDomainXMLPrivateDataCallbacks *privateData = (virDomainXMLPrivateDataCallbacks *) priv;
+            if (qemuGetMemoryBackingPath(def, privateData, mem->targetNode, mem->info.alias, &memPath) < 0)
                 return -1;
         }
 
@@ -7291,7 +7293,9 @@ qemuBuildMemPathStr(const virDomainDef *def,
             return -1;
         prealloc = true;
     } else if (def->mem.source == VIR_DOMAIN_MEMORY_SOURCE_FILE) {
-        if (qemuGetMemoryBackingPath(priv->driver, def, "ram", &mem_path) < 0)
+        // This path should not be reached if NUMA is requested
+        virDomainXMLPrivateDataCallbacks *privateData = (virDomainXMLPrivateDataCallbacks *) priv;
+        if (qemuGetMemoryBackingPath(def, privateData, 0, "ram", &mem_path) < 0)
             return -1;
     }
 
diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c
index aae9f316d8..e327a906a3 100644
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -1622,22 +1622,106 @@ qemuGetDomainHupageMemPath(virQEMUDriver *driver,
 
 
 int
-qemuGetMemoryBackingDomainPath(virQEMUDriver *driver,
-                               const virDomainDef *def,
+qemuGetMemoryBackingDomainPath(const virDomainDef *def,
+                               virDomainXMLPrivateDataCallbacks *priv,
+                               const size_t targetNode,
                                char **path)
 {
+    qemuDomainObjPrivate *privateData = (qemuDomainObjPrivate *) priv;
+    virQEMUDriver *driver = privateData->driver;
     g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
     const char *root = driver->embeddedRoot;
     g_autofree char *shortName = NULL;
+    size_t path_index = 0; // original behavior, described below
 
     if (!(shortName = virDomainDefGetShortName(def)))
         return -1;
 
-    if (root && !STRPREFIX(cfg->memoryBackingDir, root)) {
+    /*
+     * We have three use cases:
+     *
+     * 1. Domain has multiple NUMA nodes, but only a single directory path
+     *    is specified in qemu.conf (original default behavior).
+     *
+     *    In this case, we already placed the memory backing path for each NUMA node
+     *    into the same path location. Preserve the established default behavior.
+     *
+     * 2. Domain has multiple NUMA nodes, but we have asked for multiple directory
+     *    paths as well.
+     *
+     *    In this case, we will have a one-to-one relationship between the number
+     *    of NUMA nodes and the order in which the paths are provided.
+     *    If the user does not specify enough paths, then we need to throw an error.
+     *    NOTE: This is open to comment. The "ordering" of the paths here is not initially
+     *    configurable, to preserve backwards compatibility with the original qemu.conf syntax.
+     *    If controlling the ordering is desired, we would need to revise the syntax in
+     *    qemu.conf to make that possible. That hasn't been needed so far.
+     *
+     *    NOTE A): We must check with numatune here, if requested. The number of NUMA nodes
+     *    may be less than or equal to the number of provided paths. If it is less,
+     *    we have to respect the choices made by numatune. In this case, we will map the
+     *    physical NUMA nodes (0, 1, 2...) in the order in which they appear in qemu.conf.
+     *
+     * 3. Domain has a single NUMA node, but we have asked for multiple directory paths.
+     *
+     *    In this case we also need to check if numatune is requested. If so,
+     *    we want to pick the path indicated by numatune.
+     *
+     * NOTE B): In both cases 2 and 3, if numatune is requested, the path obviously cannot
+     *          be changed on the fly, like it normally would be in "restrictive" mode
+     *          during runtime. So, we will only do this if the mode requested is "strict".
+     *
+     * NOTE C): Furthermore, in both cases 2 and 3, if the number of directory paths provided
+     *          is more than one, and either: a) no numatune is provided at all or
+     *          b) numatune is in fact provided, but the mode is not strict,
+     *          then we must throw an error. This is because we cannot know which backing
+     *          directory path to choose without the user's input.
+     *
+     * NOTE D): If one or more directory paths is requested in any of the cases 1, 2, or 3,
+     *          the numatune cannot specify more than one NUMA node, because the only mode
+     *          possible with directory paths is "strict" (e.g. automatic NUMA balancing of
+     *          memory will not work). Only one NUMA node can be requested by numatune, else
+     *          we must throw an error.
+     */
+
+    if (cfg->nb_memoryBackingDirs > 1) {
+        virDomainNuma *numatune = def->numa;
+        virBitmap *numaBitmap = virDomainNumatuneGetNodeset(numatune, privateData->autoNodeset, targetNode);
+        size_t numa_node_count = virDomainNumaGetNodeCount(def->numa);
+        virDomainNumatuneMemMode mode;
+
+        if ((numatune && numaBitmap && virNumaNodesetIsAvailable(numaBitmap)) &&
+            virDomainNumatuneGetMode(def->numa, -1, &mode) == 0 &&
+            mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT &&
+            virBitmapCountBits(numaBitmap) == 1) {
+            // Is numatune provided?
+            // Is it strict?
+            // Does it only specify a single pinning for this target?
+            // Yes to all 3? then good to go.
+
+            if (cfg->nb_memoryBackingDirs < numa_node_count) {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("Domain requesting configuration for %zu NUMA nodes, but memory backing directory only has (%zu) directory paths available. Either reduce this to one directory or provide more paths to use."), numa_node_count, cfg->nb_memoryBackingDirs);
+                return -1;
+            }
+
+            path_index = virBitmapNextSetBit(numaBitmap, -1);
+        } else if (numa_node_count > 1 && numa_node_count == cfg->nb_memoryBackingDirs) {
+            // Be nice. A valid numatune and pinning has not been specified, but the number
+            // of paths matches up exactly, so just assign them one-to-one.
+            path_index = targetNode;
+        } else {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("There are (%zu) memory backing directories configured. Domain must use a 'strict' numatune as well as an associated pinning configuration for each NUMA node before proceeding. An individual NUMA node can only be pinned to a single backing directory. Please correct the domain configuration or remove the memory backing directories and try again."), cfg->nb_memoryBackingDirs);
+            return -1;
+        }
+    }
+
+    if (root && !STRPREFIX(cfg->memoryBackingDirs[path_index], root)) {
         g_autofree char * hash = virDomainDriverGenerateRootHash("qemu", root);
-        *path = g_strdup_printf("%s/%s-%s", cfg->memoryBackingDir, hash, shortName);
+        *path = g_strdup_printf("%s/%s-%s", cfg->memoryBackingDirs[path_index], hash, shortName);
     } else {
-        *path = g_strdup_printf("%s/%s", cfg->memoryBackingDir, shortName);
+        *path = g_strdup_printf("%s/%s", cfg->memoryBackingDirs[path_index], shortName);
     }
 
     return 0;
@@ -1657,8 +1741,9 @@ qemuGetMemoryBackingDomainPath(virQEMUDriver *driver,
  * -1 otherwise (with error reported).
  */
 int
-qemuGetMemoryBackingPath(virQEMUDriver *driver,
-                         const virDomainDef *def,
+qemuGetMemoryBackingPath(const virDomainDef *def,
+                         virDomainXMLPrivateDataCallbacks *priv,
+                         const size_t targetNode,
                          const char *alias,
                          char **memPath)
 {
@@ -1671,7 +1756,7 @@ qemuGetMemoryBackingPath(virQEMUDriver *driver,
         return -1;
     }
 
-    if (qemuGetMemoryBackingDomainPath(driver, def, &domainPath) < 0)
+    if (qemuGetMemoryBackingDomainPath(def, priv, targetNode, &domainPath) < 0)
         return -1;
 
     *memPath = g_strdup_printf("%s/%s", domainPath, alias);
diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h
index 2b8d540df0..4ae21524f7 100644
--- a/src/qemu/qemu_conf.h
+++ b/src/qemu/qemu_conf.h
@@ -370,11 +370,14 @@ int qemuGetDomainHupageMemPath(virQEMUDriver *driver,
                                unsigned long long pagesize,
                                char **memPath);
 
-int qemuGetMemoryBackingDomainPath(virQEMUDriver *driver,
-                                   const virDomainDef *def,
+int qemuGetMemoryBackingDomainPath(const virDomainDef *def,
+                                   virDomainXMLPrivateDataCallbacks *priv,
+                                   const size_t targetNode,
                                    char **path);
-int qemuGetMemoryBackingPath(virQEMUDriver *driver,
-                             const virDomainDef *def,
+
+int qemuGetMemoryBackingPath(const virDomainDef *def,
+                             virDomainXMLPrivateDataCallbacks *priv,
+                             const size_t targetNode,
                              const char *alias,
                              char **memPath);
 
-- 
2.25.1
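
The directory-selection rules in the large comment in qemuGetMemoryBackingDomainPath() can be hard to follow in diff form. What follows is a minimal standalone sketch of that decision, with the libvirt types replaced by plain values. The function name pick_backing_dir_index and all of its parameters are hypothetical and do not appear in the patch. The bounds check on the pinned node is an assumption added for the sketch; the patch as posted does not perform it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch. Returns the index into the
 * configured backing directories, or -1 where the patch reports an error.
 *
 * nb_dirs:     number of configured memory backing directories (>= 1)
 * numa_nodes:  number of guest NUMA nodes in the domain
 * target_node: guest NUMA node whose memory is being placed
 * strict:      true if the numatune mode is "strict"
 * pinned_node: the single host node the target is pinned to, or -1 if
 *              there is no pinning or it names more than one node
 */
static long
pick_backing_dir_index(size_t nb_dirs, size_t numa_nodes, size_t target_node,
                       bool strict, long pinned_node)
{
    assert(nb_dirs >= 1);

    /* Case 1: a single directory, so keep the original behavior. */
    if (nb_dirs == 1)
        return 0;

    /* Strict numatune with exactly one pinned node: follow the pinning. */
    if (strict && pinned_node >= 0) {
        if (nb_dirs < numa_nodes)
            return -1;
        /* Assumption: bounds check added for this sketch only. */
        if ((size_t)pinned_node >= nb_dirs)
            return -1;
        return pinned_node;
    }

    /* No usable pinning, but the counts match: assign one-to-one. */
    if (numa_nodes > 1 && numa_nodes == nb_dirs)
        return (long)target_node;

    /* Multiple directories and no way to choose between them. */
    return -1;
}
```

With two directories and two guest nodes but no numatune, node 1 maps to directory 1. With two directories and a single guest node that is not strictly pinned, the sketch returns -1, which corresponds to the second virReportError() branch in the patch.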