From: Julio Faracco
To: libvir-list@redhat.com
Subject: [PATCH v4 4/5] lxc: Implement virtual /proc/cpuinfo via LXC fuse
Date: Mon, 24 Feb 2020 11:24:28 -0300
Message-Id: <20200224142428.538-5-jcfaracco@gmail.com>
In-Reply-To: <20200224142428.538-1-jcfaracco@gmail.com>
References: <20200224142428.538-1-jcfaracco@gmail.com>
Cc: danielhb413@gmail.com

This commit addresses two issues related to LXC VCPUs. The first
concerns the content of /proc/cpuinfo: even if only 1 VCPU is set, an
LXC container shows all CPUs available on the host. The second concerns
CPU usage: if a user sets only 1 VCPU, the container process can still
use all available CPUs (this is not the case when the `cpuset`
attribute is declared).

So, this commit adds a virtual cpuinfo based on the VCPU mapping and
automatically limits CPU usage according to the VCPU count.

Example (now):
LXC container - 8 CPUs with 2 VCPUs:
lxc-root# stress --cpu 8

On the host machine, only CPUs 0 and 1 show 100% usage.
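
For illustration only, the cpuinfo filtering idea can be sketched as a
standalone C function. This is a simplified sketch and not the code in
this patch: filter_cpuinfo() is a hypothetical helper, it assumes that
the first `nvcpus` host processors are the ones to expose (the patch
itself consults the domain's VCPU definitions), and it uses plain libc
instead of virBuffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libvirt: copy the first `nvcpus`
 * "processor" blocks of a cpuinfo-style text from `in` to `out`,
 * renumbering them from 0. Returns the number of bytes written, not
 * counting the terminating NUL. Output is truncated to fit `outsz`.
 */
size_t
filter_cpuinfo(const char *in, size_t nvcpus, char *out, size_t outsz)
{
    size_t used = 0;   /* bytes already written to out */
    size_t cur = 0;    /* next processor index seen by the container */
    int keep = 0;      /* non-zero while inside a block we expose */

    assert(in != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    while (*in != '\0') {
        size_t len = strcspn(in, "\n");   /* length of current line */
        size_t hostcpu;
        int n = 0;

        if (sscanf(in, "processor\t: %zu", &hostcpu) == 1) {
            /* Start of a new block: decide whether to expose it. */
            keep = cur < nvcpus;
            if (keep)
                n = snprintf(out + used, outsz - used,
                             "processor\t: %zu\n", cur++);
        } else if (keep) {
            /* Any other line (including blank separators) is copied. */
            n = snprintf(out + used, outsz - used,
                         "%.*s\n", (int)len, in);
        }

        if (n > 0) {
            /* snprintf reports the untruncated length; clamp it. */
            size_t room = outsz - used - 1;
            used += (size_t)n < room ? (size_t)n : room;
        }

        in += len;
        if (*in == '\n')
            in++;
    }

    return used;
}
```

A caller would pass the text read from the host /proc/cpuinfo and the
VCPU count, then hand the filtered buffer to the FUSE read callback.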
Signed-off-by: Julio Faracco
---
 src/lxc/lxc_cgroup.c    | 31 ++++++++++++++
 src/lxc/lxc_container.c | 39 ++++++++++-------
 src/lxc/lxc_fuse.c      | 95 ++++++++++++++++++++++++++++++++++++++---
 3 files changed, 145 insertions(+), 20 deletions(-)

diff --git a/src/lxc/lxc_cgroup.c b/src/lxc/lxc_cgroup.c
index d29b65092a..912a252473 100644
--- a/src/lxc/lxc_cgroup.c
+++ b/src/lxc/lxc_cgroup.c
@@ -50,6 +50,34 @@ static int virLXCCgroupSetupCpuTune(virDomainDefPtr def,
 }
 
 
+static int virLXCCgroupSetupVcpuAuto(virDomainDefPtr def,
+                                     virCgroupPtr cgroup)
+{
+    size_t i;
+    int vcpumax;
+    virBuffer buffer = VIR_BUFFER_INITIALIZER;
+    virBufferPtr cpuset = &buffer;
+
+    vcpumax = virDomainDefGetVcpusMax(def);
+    for (i = 0; i < vcpumax; i++) {
+        virDomainVcpuDefPtr vcpu = virDomainDefGetVcpu(def, i);
+        /* Cgroup is smart enough to convert numbers separated
+         * by comma into ranges. Example: "0,1,2,5," -> "0-2,5".
+         * Libvirt does not need to process it here. */
+        if (vcpu)
+            virBufferAsprintf(cpuset, "%zu,", i);
+    }
+    if (virCgroupSetCpusetCpus(cgroup,
+                               virBufferCurrentContent(cpuset)) < 0) {
+        virBufferFreeAndReset(cpuset);
+        return -1;
+    }
+
+    virBufferFreeAndReset(cpuset);
+    return 0;
+}
+
+
 static int virLXCCgroupSetupCpusetTune(virDomainDefPtr def,
                                        virCgroupPtr cgroup,
                                        virBitmapPtr nodemask)
@@ -61,6 +89,9 @@ static int virLXCCgroupSetupCpusetTune(virDomainDefPtr def,
         def->cpumask &&
         virCgroupSetupCpusetCpus(cgroup, def->cpumask) < 0) {
         return -1;
+    } else {
+        /* auto mode for VCPU limits */
+        virLXCCgroupSetupVcpuAuto(def, cgroup);
     }
 
     if (virDomainNumatuneGetMode(def->numa, -1, &mode) < 0 ||
diff --git a/src/lxc/lxc_container.c b/src/lxc/lxc_container.c
index 41efe43a14..88e27f3060 100644
--- a/src/lxc/lxc_container.c
+++ b/src/lxc/lxc_container.c
@@ -997,8 +997,8 @@ static int lxcContainerMountBasicFS(bool userns_enabled,
 static int lxcContainerMountProcFuse(virDomainDefPtr def,
                                      const char *stateDir)
 {
-    int ret;
-    char *meminfo_path = NULL;
+    g_autofree char *meminfo_path = NULL;
+    g_autofree char *cpuinfo_path = NULL;
 
     VIR_DEBUG("Mount /proc/meminfo stateDir=%s", stateDir);
 
@@ -1006,15 +1006,29 @@ static int lxcContainerMountProcFuse(virDomainDefPtr def,
                                    stateDir,
                                    def->name);
 
-    if ((ret = mount(meminfo_path, "/proc/meminfo",
-                     NULL, MS_BIND, NULL)) < 0) {
+    if (mount(meminfo_path, "/proc/meminfo",
+              NULL, MS_BIND, NULL) < 0) {
         virReportSystemError(errno,
                              _("Failed to mount %s on /proc/meminfo"),
                              meminfo_path);
+        return -1;
     }
 
-    VIR_FREE(meminfo_path);
-    return ret;
+    VIR_DEBUG("Mount /proc/cpuinfo stateDir=%s", stateDir);
+
+    cpuinfo_path = g_strdup_printf("/.oldroot/%s/%s.fuse/cpuinfo",
+                                   stateDir,
+                                   def->name);
+
+    if (mount(cpuinfo_path, "/proc/cpuinfo",
+              NULL, MS_BIND, NULL) < 0) {
+        virReportSystemError(errno,
+                             _("Failed to mount %s on /proc/cpuinfo"),
+                             cpuinfo_path);
+        return -1;
+    }
+
+    return 0;
 }
 #else
 static int lxcContainerMountProcFuse(virDomainDefPtr def G_GNUC_UNUSED,
@@ -1027,8 +1041,7 @@ static int lxcContainerMountProcFuse(virDomainDefPtr def G_GNUC_UNUSED,
 static int lxcContainerMountFSDev(virDomainDefPtr def,
                                   const char *stateDir)
 {
-    int ret = -1;
-    char *path = NULL;
+    g_autofree char *path = NULL;
     int flags = def->idmap.nuidmap ? MS_BIND : MS_MOVE;
 
     VIR_DEBUG("Mount /dev/ stateDir=%s", stateDir);
@@ -1038,7 +1051,7 @@ static int lxcContainerMountFSDev(virDomainDefPtr def,
     if (virFileMakePath("/dev") < 0) {
         virReportSystemError(errno, "%s",
                              _("Cannot create /dev"));
-        goto cleanup;
+        return -1;
     }
 
     VIR_DEBUG("Trying to %s %s to /dev", def->idmap.nuidmap ?
@@ -1048,14 +1061,10 @@ static int lxcContainerMountFSDev(virDomainDefPtr def,
         virReportSystemError(errno,
                              _("Failed to mount %s on /dev"),
                              path);
-        goto cleanup;
+        return -1;
     }
 
-    ret = 0;
-
- cleanup:
-    VIR_FREE(path);
-    return ret;
+    return 0;
 }
 
 static int lxcContainerMountFSDevPTS(virDomainDefPtr def,
diff --git a/src/lxc/lxc_fuse.c b/src/lxc/lxc_fuse.c
index 8cfccdd7e0..b2117bfa17 100644
--- a/src/lxc/lxc_fuse.c
+++ b/src/lxc/lxc_fuse.c
@@ -36,23 +36,29 @@
 
 #if WITH_FUSE
 
+#ifndef CPUINFO_FILE_LEN
+# define CPUINFO_FILE_LEN (1024*1024)
+#endif
+
 static const char *fuse_meminfo_path = "/meminfo";
+static const char *fuse_cpuinfo_path = "/cpuinfo";
 
 static int lxcProcGetattr(const char *path, struct stat *stbuf)
 {
-    g_autofree char *mempath = NULL;
+    g_autofree char *procpath = NULL;
     struct stat sb;
     struct fuse_context *context = fuse_get_context();
     virDomainDefPtr def = (virDomainDefPtr)context->private_data;
 
     memset(stbuf, 0, sizeof(struct stat));
-    mempath = g_strdup_printf("/proc/%s", path);
+    procpath = g_strdup_printf("/proc/%s", path);
 
     if (STREQ(path, "/")) {
         stbuf->st_mode = S_IFDIR | 0755;
         stbuf->st_nlink = 2;
-    } else if (STREQ(path, fuse_meminfo_path)) {
-        if (stat(mempath, &sb) < 0)
+    } else if (STREQ(path, fuse_meminfo_path) ||
+               STREQ(path, fuse_cpuinfo_path)) {
+        if (stat(procpath, &sb) < 0)
             return -errno;
 
         stbuf->st_uid = def->idmap.uidmap ? def->idmap.uidmap[0].target : 0;
@@ -83,6 +89,7 @@ static int lxcProcReaddir(const char *path, void *buf,
     filler(buf, ".", NULL, 0);
     filler(buf, "..", NULL, 0);
     filler(buf, fuse_meminfo_path + 1, NULL, 0);
+    filler(buf, fuse_cpuinfo_path + 1, NULL, 0);
 
     return 0;
 }
@@ -90,7 +97,8 @@ static int lxcProcReaddir(const char *path, void *buf,
 static int lxcProcOpen(const char *path G_GNUC_UNUSED,
                        struct fuse_file_info *fi G_GNUC_UNUSED)
 {
-    if (STRNEQ(path, fuse_meminfo_path))
+    if (STRNEQ(path, fuse_meminfo_path) &&
+        STRNEQ(path, fuse_cpuinfo_path))
         return -ENOENT;
 
     if ((fi->flags & 3) != O_RDONLY)
@@ -227,6 +235,80 @@ static int lxcProcReadMeminfo(char *hostpath, virDomainDefPtr def,
     return res;
 }
 
+
+static int
+lxcProcReadCpuinfoParse(virDomainDefPtr def, char *base,
+                        virBufferPtr new_cpuinfo)
+{
+    char *procline = NULL;
+    char *saveptr = base;
+    size_t cpu;
+    size_t nvcpu;
+    size_t curcpu = 0;
+    bool get_proc = false;
+
+    nvcpu = virDomainDefGetVcpus(def);
+    while ((procline = strtok_r(NULL, "\n", &saveptr))) {
+        if (sscanf(procline, "processor\t: %zu", &cpu) == 1) {
+            virDomainVcpuDefPtr vcpu = virDomainDefGetVcpu(def, cpu);
+            /* VCPU is mapped */
+            if (vcpu) {
+                if (curcpu == nvcpu)
+                    break;
+
+                if (curcpu > 0)
+                    virBufferAddLit(new_cpuinfo, "\n");
+
+                virBufferAsprintf(new_cpuinfo, "processor\t: %zu\n",
+                                  curcpu);
+                curcpu++;
+                get_proc = true;
+            } else {
+                get_proc = false;
+            }
+        } else {
+            /* It is not a processor index */
+            if (get_proc)
+                virBufferAsprintf(new_cpuinfo, "%s\n", procline);
+        }
+    }
+
+    virBufferAddLit(new_cpuinfo, "\n");
+
+    return strlen(virBufferCurrentContent(new_cpuinfo));
+}
+
+
+static int lxcProcReadCpuinfo(char *hostpath, virDomainDefPtr def,
+                              char *buf, size_t size, off_t offset)
+{
+    virBuffer buffer = VIR_BUFFER_INITIALIZER;
+    virBufferPtr new_cpuinfo = &buffer;
+    g_autofree char *outbuf = NULL;
+    int res = -1;
+
+    /* Gather info from /proc/cpuinfo */
+    if (virFileReadAll(hostpath, CPUINFO_FILE_LEN, &outbuf) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Failed to open %s"), hostpath);
+        return -1;
+    }
+
+    /* /proc/cpuinfo does not support fseek */
+    if (offset > 0)
+        return 0;
+
+    res = lxcProcReadCpuinfoParse(def, outbuf, new_cpuinfo);
+
+    if (res > size)
+        res = size;
+    memcpy(buf, virBufferCurrentContent(new_cpuinfo), res);
+
+    virBufferFreeAndReset(new_cpuinfo);
+    return res;
+}
+
+
 static int lxcProcRead(const char *path G_GNUC_UNUSED,
                        char *buf G_GNUC_UNUSED,
                        size_t size G_GNUC_UNUSED,
@@ -246,6 +328,9 @@ static int lxcProcRead(const char *path G_GNUC_UNUSED,
     if (STREQ(path, fuse_meminfo_path)) {
         if ((res = lxcProcReadMeminfo(hostpath, def, buf, size, offset)) < 0)
             res = lxcProcHostRead(hostpath, buf, size, offset);
+    } else if (STREQ(path, fuse_cpuinfo_path)) {
+        if ((res = lxcProcReadCpuinfo(hostpath, def, buf, size, offset)) < 0)
+            res = lxcProcHostRead(hostpath, buf, size, offset);
     }
 
     return res;
-- 
2.20.1