From nobody Sun Feb 8 04:30:34 2026 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of redhat.com designates 209.132.183.28 as permitted sender) client-ip=209.132.183.28; envelope-from=libvir-list-bounces@redhat.com; helo=mx1.redhat.com; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of redhat.com designates 209.132.183.28 as permitted sender) smtp.mailfrom=libvir-list-bounces@redhat.com; dmarc=fail(p=none dis=none) header.from=intel.com Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by mx.zohomail.com with SMTPS id 1531902090968371.6614448143039; Wed, 18 Jul 2018 01:21:30 -0700 (PDT) Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 039AF83F46; Wed, 18 Jul 2018 08:21:29 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B4154100195E; Wed, 18 Jul 2018 08:21:28 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 5A84124F64; Wed, 18 Jul 2018 08:21:28 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id w6I8LEi3030803 for ; Wed, 18 Jul 2018 04:21:14 -0400 Received: by smtp.corp.redhat.com (Postfix) id 820125D761; Wed, 18 Jul 2018 08:21:14 +0000 (UTC) Received: from mx1.redhat.com (ext-mx20.extmail.prod.ext.phx2.redhat.com [10.5.110.49]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 77B2A5D6A5 for ; Wed, 18 Jul 2018 08:21:10 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B3578308625D for ; Wed, 18 Jul 2018 08:21:08 +0000 (UTC) Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Jul 2018 01:21:05 -0700 Received: from bing-i9.bj.intel.com ([172.16.182.85]) by fmsmga006.fm.intel.com with ESMTP; 18 Jul 2018 01:20:52 -0700 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,369,1526367600"; d="scan'208";a="246666680" From: bing.niu@intel.com To: libvir-list@redhat.com Date: Wed, 18 Jul 2018 15:57:55 +0800 Message-Id: <1531900679-2756-6-git-send-email-bing.niu@intel.com> In-Reply-To: <1531900679-2756-1-git-send-email-bing.niu@intel.com> References: <1531900679-2756-1-git-send-email-bing.niu@intel.com> X-Greylist: Sender passed SPF test, Sender IP whitelisted by DNSRBL, ACL 207 matched, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Wed, 18 Jul 2018 08:21:08 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Wed, 18 Jul 2018 08:21:08 +0000 (UTC) for IP:'192.55.52.136' DOMAIN:'mga12.intel.com' HELO:'mga12.intel.com' FROM:'bing.niu@intel.com' RCPT:'' X-RedHat-Spam-Score: -2.301 (RCVD_IN_DNSWL_MED, SPF_PASS) 192.55.52.136 mga12.intel.com 192.55.52.136 mga12.intel.com X-Scanned-By: MIMEDefang 2.84 on 10.5.110.49 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-loop: libvir-list@redhat.com Cc: shaohe.feng@intel.com, huaqiang.wang@intel.com, Bing Niu , jian-feng.ding@intel.com, rui.zang@yandex.com Subject: [libvirt] [PATCH 5/9] util: Add MBA allocation to virresctrl X-BeenThere: libvir-list@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: Development discussions about the libvirt library & tools List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: libvir-list-bounces@redhat.com Errors-To: libvir-list-bounces@redhat.com X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Wed, 18 Jul 2018 08:21:29 +0000 (UTC) X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Type: text/plain; charset="utf-8" From: Bing Niu Add memory bandwidth allocation support to virresctrl class. Introducing virResctrlAllocMemBW which is used for allocating memory bandwidth. Following virResctrlAllocPerType, it also employs a nested sparse array to indicate whether allocation is available for particular last level cache. Signed-off-by: Bing Niu --- src/util/virresctrl.c | 346 ++++++++++++++++++++++++++++++++++++++++++++++= ++-- src/util/virresctrl.h | 13 ++ 2 files changed, 346 insertions(+), 13 deletions(-) diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c index 06e2702..bec2afd 100644 --- a/src/util/virresctrl.c +++ b/src/util/virresctrl.c @@ -36,9 +36,9 @@ VIR_LOG_INIT("util.virresctrl") =20 =20 /* Resctrl is short for Resource Control. It might be implemented for var= ious - * resources, but at the time of this writing this is only supported for c= ache - * allocation technology (aka CAT). Hence the reson for leaving 'Cache' o= ut of - * all the structure and function names for now (can be added later if nee= ded. + * resources. Currently this supports cache allocation technology (aka CAT= ) and + * memory bandwidth allocation (aka MBA). More resources technologies may = be + * added in feature. */ =20 =20 @@ -89,6 +89,8 @@ typedef virResctrlAllocPerType *virResctrlAllocPerTypePtr; typedef struct _virResctrlAllocPerLevel virResctrlAllocPerLevel; typedef virResctrlAllocPerLevel *virResctrlAllocPerLevelPtr; =20 +typedef struct _virResctrlAllocMemBW virResctrlAllocMemBW; +typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr; =20 /* Class definitions and initializations */ static virClassPtr virResctrlInfoClass; @@ -181,7 +183,10 @@ virResctrlInfoDispose(void *obj) * consequently a directory under /sys/fs/resctrl). Since it can have mul= tiple * parts of multiple caches allocated it is represented as bunch of nested * sparse arrays (by sparse I mean array of pointers so that each might be= NULL - * in case there is no allocation for that particular one (level, cache, .= ..)). + * in case there is no allocation for that particular cache allocation (le= vel, + * cache, ...) or memory allocation for particular node). + * + * =3D=3D=3D=3D=3DCache allocation technology (CAT)=3D=3D=3D=3D=3D * * Since one allocation can be made for caches on different levels, the fi= rst * nested sparse array is of types virResctrlAllocPerLevel. For example i= f you @@ -206,6 +211,16 @@ virResctrlInfoDispose(void *obj) * all of them. While doing that we store the bitmask in a sparse array of * virBitmaps named `masks` indexed the same way as `sizes`. The upper bo= unds * of the sparse arrays are stored in nmasks or nsizes, respectively. + * + * =3D=3D=3D=3D=3DMemory Bandwidth allocation technology (MBA)=3D=3D=3D=3D= =3D + * + * The memory bandwidth allocation support in virResctrlAlloc works in the= same + * fashion as CAT. However, memory bandwidth controller doesn't have a hie= rarchy + * organization as cache, each node have one memory bandwidth controller to + * memory bandwidth distribution. The number of memory bandwidth controlle= r is + * identical with number of last level cache. So MBA also employs a sparse= array + * to represent whether a memory bandwidth allocation happens on correspon= ding node. + * The available memory controller number is collected in 'virResctrlInfo'. */ struct _virResctrlAllocPerType { /* There could be bool saying whether this is set or not, but since ev= erything @@ -226,12 +241,24 @@ struct _virResctrlAllocPerLevel { * VIR_CACHE_TYPE_LAST number of items */ }; =20 +/* + * virResctrlAllocMemBW represents one memory bandwidth allocation. Since = it can have + * several last level caches in a NUMA system, it is also represented as a= nested + * sparse arrays as virRestrlAllocPerLevel. + */ +struct _virResctrlAllocMemBW { + unsigned int **bandwidths; + size_t nbandwidths; +}; + struct _virResctrlAlloc { virObject parent; =20 virResctrlAllocPerLevelPtr *levels; size_t nlevels; =20 + virResctrlAllocMemBWPtr mem_bw; + /* The identifier (any unique string for now) */ char *id; /* libvirt-generated path in /sys/fs/resctrl for this particular @@ -275,6 +302,13 @@ virResctrlAllocDispose(void *obj) VIR_FREE(level); } =20 + if (alloc->mem_bw) { + virResctrlAllocMemBWPtr mem_bw =3D alloc->mem_bw; + for (i =3D 0; i < mem_bw->nbandwidths; i++) + VIR_FREE(mem_bw->bandwidths[i]); + } + + VIR_FREE(alloc->mem_bw); VIR_FREE(alloc->id); VIR_FREE(alloc->path); VIR_FREE(alloc->levels); @@ -697,6 +731,9 @@ virResctrlAllocIsEmpty(virResctrlAllocPtr alloc) if (!alloc) return true; =20 + if (alloc->mem_bw) + return false; + for (i =3D 0; i < alloc->nlevels; i++) { virResctrlAllocPerLevelPtr a_level =3D alloc->levels[i]; =20 @@ -890,6 +927,27 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc, =20 =20 int +virResctrlAllocForeachMemory(virResctrlAllocPtr alloc, + virResctrlAllocForeachMemoryCallback cb, + void *opaque) +{ + size_t i =3D 0; + + if (!alloc) + return 0; + + if (alloc->mem_bw) { + virResctrlAllocMemBWPtr mem_bw =3D alloc->mem_bw; + for (i =3D 0; i < mem_bw->nbandwidths; i++) + if (mem_bw->bandwidths[i]) + cb(i, *mem_bw->bandwidths[i], opaque); + } + + return 0; +} + + +int virResctrlAllocForeachCache(virResctrlAllocPtr alloc, virResctrlAllocForeachCacheCallback cb, void *opaque) @@ -952,6 +1010,240 @@ virResctrlAllocGetID(virResctrlAllocPtr alloc) } =20 =20 +static void +virResctrlMemoryBandwidthSubtract(virResctrlAllocPtr free, + virResctrlAllocPtr used) +{ + size_t i; + + if (!used->mem_bw) + return; + + for (i =3D 0; i < used->mem_bw->nbandwidths; i++) { + if (used->mem_bw->bandwidths[i]) + *(free->mem_bw->bandwidths[i]) -=3D *(used->mem_bw->bandwidths= [i]); + } +} + + +int +virResctrlSetMemoryBandwidth(virResctrlAllocPtr alloc, + unsigned int id, + unsigned int memory_bandwidth) +{ + virResctrlAllocMemBWPtr mem_bw =3D alloc->mem_bw; + + if (!mem_bw) { + if (VIR_ALLOC(mem_bw) < 0) + return -1; + alloc->mem_bw =3D mem_bw; + } + + if (mem_bw->nbandwidths <=3D id && + VIR_EXPAND_N(mem_bw->bandwidths, mem_bw->nbandwidths, + id - mem_bw->nbandwidths + 1) < 0) + return -1; + + if (mem_bw->bandwidths[id]) { + virReportError(VIR_ERR_XML_ERROR, + _("Memory Bandwidth already defined for node %u"), + id); + return -1; + } + + if (VIR_ALLOC(mem_bw->bandwidths[id]) < 0) + return -1; + + *(mem_bw->bandwidths[id]) =3D memory_bandwidth; + return 0; +} + + +static int +virResctrlAllocMemoryBandwidthFormat(virResctrlAllocPtr alloc, + virBufferPtr buf) +{ + size_t i; + + if (!alloc->mem_bw) + return 0; + + virBufferAddLit(buf, "MB:"); + + for (i =3D 0; i < alloc->mem_bw->nbandwidths; i++) { + if (alloc->mem_bw->bandwidths[i]) { + virBufferAsprintf(buf, "%zd=3D%u;", i, + *(alloc->mem_bw->bandwidths[i])); + } + } + + virBufferTrim(buf, ";", 1); + virBufferAddChar(buf, '\n'); + if (virBufferCheckError(buf) < 0) + return -1; + else + return 0; +} + + +static int +virResctrlAllocMemoryBandwidth(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + virResctrlAllocPtr free) +{ + size_t i; + virResctrlAllocMemBWPtr mem_bw_alloc =3D alloc->mem_bw; + virResctrlAllocMemBWPtr mem_bw_free =3D free->mem_bw; + virResctrlInfoMemBWPtr mem_bw_info =3D resctrl->membw_info; + + if (!mem_bw_alloc) + return 0; + + if (mem_bw_alloc && !mem_bw_info) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("RDT Memory Bandwidth allocation " + "unsupported")); + return -1; + } + + for (i =3D 0; i < mem_bw_alloc->nbandwidths; i++) { + if (!mem_bw_alloc->bandwidths[i]) + continue; + + if (*(mem_bw_alloc->bandwidths[i]) % mem_bw_info->bandwidth_granul= arity) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Memory Bandwidth allocation of size " + "%u is not divisible by granularity %u"), + *(mem_bw_alloc->bandwidths[i]), + mem_bw_info->bandwidth_granularity); + return -1; + } + if (*(mem_bw_alloc->bandwidths[i]) < mem_bw_info->min_bandwidth) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Memory Bandwidth allocation of size " + "%u is smaller than the minimum " + "allowed allocation %u"), + *(mem_bw_alloc->bandwidths[i]), + mem_bw_info->min_bandwidth); + return -1; + } + if (i > mem_bw_info->max_id) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("bandwidth controller %zd not exist, " + "max controller id %u"), + i, mem_bw_info->max_id); + return -1; + } + if (*(mem_bw_alloc->bandwidths[i]) > *(mem_bw_free->bandwidths[i])= ) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("Not enough room for allocation of %u%% " + "bandwidth on node %zd, available bandwidth %= u%%"), + *(mem_bw_alloc->bandwidths[i]), i, + *(mem_bw_free->bandwidths[i])); + return -1; + } + } + return 0; +} + + +static int +virResctrlAllocParseProcessMemoryBandwidth(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *mem_bw) +{ + unsigned int bandwidth; + unsigned int id; + char *tmp =3D NULL; + + tmp =3D strchr(mem_bw, '=3D'); + if (!tmp) + return 0; + *tmp =3D '\0'; + tmp++; + + if (virStrToLong_uip(mem_bw, NULL, 10, &id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid node id %u "), id); + return -1; + } + if (virStrToLong_uip(tmp, NULL, 10, &bandwidth) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Invalid bandwidth %u"), bandwidth); + return -1; + } + if (bandwidth < resctrl->membw_info->min_bandwidth || + id > resctrl->membw_info->max_id) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Missing or inconsistent resctrl info for " + "memory bandwidth node '%u'"), id); + return -1; + } + if (alloc->mem_bw->nbandwidths <=3D id && + VIR_EXPAND_N(alloc->mem_bw->bandwidths, alloc->mem_bw->nbandwidths, + id - alloc->mem_bw->nbandwidths + 1) < 0) { + return -1; + } + if (!alloc->mem_bw->bandwidths[id]) { + if (VIR_ALLOC(alloc->mem_bw->bandwidths[id]) < 0) + return -1; + } + + *(alloc->mem_bw->bandwidths[id]) =3D bandwidth; + return 0; +} + + +static int +virResctrlAllocParseMemoryBandwidthLine(virResctrlInfoPtr resctrl, + virResctrlAllocPtr alloc, + char *line) +{ + char **mbs =3D NULL; + char *tmp =3D NULL; + size_t nmbs =3D 0; + size_t i; + int ret =3D -1; + + /* For no reason there can be spaces */ + virSkipSpaces((const char **) &line); + + if (STRNEQLEN(line, "MB", 2)) + return 0; + + if (!resctrl || !resctrl->membw_info || + !resctrl->membw_info->min_bandwidth || + !resctrl->membw_info->bandwidth_granularity) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing or inconsistent resctrl info for " + "memory bandwidth allocation")); + } + + if (!alloc->mem_bw) { + if (VIR_ALLOC(alloc->mem_bw) < 0) + return -1; + } + + tmp =3D strchr(line, ':'); + if (!tmp) + return 0; + tmp++; + + mbs =3D virStringSplitCount(tmp, ";", 0, &nmbs); + if (nmbs =3D=3D 0) + return 0; + + for (i =3D 0; i < nmbs; i++) { + if (virResctrlAllocParseProcessMemoryBandwidth(resctrl, alloc, mbs= [i]) < 0) + goto cleanup; + } + ret =3D 0; + cleanup: + virStringListFree(mbs); + return ret; +} + + static int virResctrlAllocFormatCache(virResctrlAllocPtr alloc, virBufferPtr buf) { @@ -1013,6 +1305,11 @@ virResctrlAllocFormat(virResctrlAllocPtr alloc) return NULL; } =20 + if (virResctrlAllocMemoryBandwidthFormat(alloc, &buf) < 0) { + virBufferFreeAndReset(&buf); + return NULL; + } + return virBufferContentAndReset(&buf); } =20 @@ -1139,6 +1436,8 @@ virResctrlAllocParse(virResctrlInfoPtr resctrl, =20 lines =3D virStringSplitCount(schemata, "\n", 0, &nlines); for (i =3D 0; i < nlines; i++) { + if (virResctrlAllocParseMemoryBandwidthLine(resctrl, alloc, lines[= i]) < 0) + goto cleanup; if (virResctrlAllocParseCacheLine(resctrl, alloc, lines[i]) < 0) goto cleanup; } @@ -1273,6 +1572,22 @@ virResctrlAllocNewFromInfo(virResctrlInfoPtr info) } } =20 + /* set default free memory bandwidth to 100%*/ + if (info->membw_info) { + if (VIR_ALLOC(ret->mem_bw) < 0) + goto error; + + if (VIR_EXPAND_N(ret->mem_bw->bandwidths, ret->mem_bw->nbandwidths, + info->membw_info->max_id + 1) < 0) + goto error; + + for (i =3D 0; i < ret->mem_bw->nbandwidths; i++) { + if (VIR_ALLOC(ret->mem_bw->bandwidths[i]) < 0) + goto error; + *(ret->mem_bw->bandwidths[i]) =3D 100; + } + } + cleanup: virBitmapFree(mask); return ret; @@ -1284,13 +1599,14 @@ virResctrlAllocNewFromInfo(virResctrlInfoPtr info) =20 /* * This function creates an allocation that represents all unused parts of= all - * caches in the system. It uses virResctrlInfo for creating a new full - * allocation with all bits set (using virResctrlAllocNewFromInfo()) and t= hen - * scans for all allocations under /sys/fs/resctrl and subtracts each one = of - * them from it. That way it can then return an allocation with only bit = set - * being those that are not mentioned in any other allocation. It is used= for - * two things, a) calculating the masks when creating allocations and b) f= rom - * tests. + * caches and memory bandwidth in the system. It uses virResctrlInfo for + * creating a new full allocation with all bits set (using + * virResctrlAllocNewFromInfo()), memory bandwidth 100% and then scans + * for all allocations under /sys/fs/resctrl and subtracts each one of them + * from it. That way it can then return an allocation with only bit set + * being those that are not mentioned in any other allocation for CAT and + * available memory bandwidth for MBA. It is used for two things, a) calc= ulating + * the masks and bandwidth available when creating allocations and b) from= tests. */ virResctrlAllocPtr virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) @@ -1336,6 +1652,7 @@ virResctrlAllocGetUnused(virResctrlInfoPtr resctrl) goto error; } =20 + virResctrlMemoryBandwidthSubtract(ret, alloc); virResctrlAllocSubtract(ret, alloc); virObjectUnref(alloc); alloc =3D NULL; @@ -1526,8 +1843,8 @@ virResctrlAllocCopyMasks(virResctrlAllocPtr dst, =20 /* * This function is called when creating an allocation in the system. Wha= t it - * does is that it gets all the unused bits using virResctrlAllocGetUnused= () and - * then tries to find a proper space for every requested allocation effect= ively + * does is that it gets all the unused resources using virResctrlAllocGetU= nused() + * and then tries to find a proper space for every requested allocation ef= fectively * transforming `sizes` into `masks`. */ static int @@ -1547,6 +1864,9 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl, if (!alloc_default) goto cleanup; =20 + if (virResctrlAllocMemoryBandwidth(resctrl, alloc, alloc_free) < 0) + goto cleanup; + if (virResctrlAllocCopyMasks(alloc, alloc_default) < 0) goto cleanup; =20 diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h index d657c06..d43fd31 100644 --- a/src/util/virresctrl.h +++ b/src/util/virresctrl.h @@ -73,6 +73,10 @@ typedef int virResctrlAllocForeachCacheCallback(unsigned= int level, unsigned long long size, void *opaque); =20 +typedef int virResctrlAllocForeachMemoryCallback(unsigned int id, + unsigned int size, + void *opaque); + virResctrlAllocPtr virResctrlAllocNew(void); =20 @@ -85,6 +89,15 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc, virCacheType type, unsigned int cache, unsigned long long size); +int +virResctrlAllocForeachMemory(virResctrlAllocPtr resctrl, + virResctrlAllocForeachMemoryCallback cb, + void *opaque); + +int +virResctrlSetMemoryBandwidth(virResctrlAllocPtr alloc, + unsigned int id, + unsigned int memory_bandwidth); =20 int virResctrlAllocForeachCache(virResctrlAllocPtr alloc, --=20 2.7.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list