From: Praveen K Paladugu
To: libvir-list@redhat.com
Cc: william.douglas@intel.com
Subject: [libvirt PATCH v4 2/7] ch: methods for cgroup mgmt in ch driver
Date: Tue, 11 Jan 2022 22:43:24 +0000
Message-Id: <20220111224329.2611962-3-prapal@linux.microsoft.com>
In-Reply-To: <20220111224329.2611962-1-prapal@linux.microsoft.com>
References: <20220111224329.2611962-1-prapal@linux.microsoft.com>
List-Id: Development discussions about the libvirt library & tools

From: Vineeth Pillai

Signed-off-by: Vineeth Pillai
Signed-off-by: Praveen K Paladugu
---
 src/ch/ch_conf.c    |   2 +
 src/ch/ch_conf.h    |   4 +-
 src/ch/ch_domain.c  |  34 +++++
 src/ch/ch_domain.h  |  11 +-
 src/ch/ch_monitor.c |  96 ++++++++++++++
 src/ch/ch_monitor.h |  54 +++++++-
 src/ch/ch_process.c | 308 ++++++++++++++++++++++++++++++++++++++++++--
 src/ch/ch_process.h |   3 +
 8 files changed, 492 insertions(+), 20 deletions(-)

diff --git a/src/ch/ch_conf.c b/src/ch/ch_conf.c
index 98f1e89003..be12934dcd 100644
--- a/src/ch/ch_conf.c
+++ b/src/ch/ch_conf.c
@@ -125,6 +125,8 @@ virCHDriverConfigNew(bool privileged)
     if (!(cfg = virObjectNew(virCHDriverConfigClass)))
         return NULL;
 
+    cfg->cgroupControllers = -1; /* Auto detect */
+
     if (privileged) {
         if (virGetUserID(CH_USER, &cfg->user) < 0)
             return NULL;
diff --git a/src/ch/ch_conf.h b/src/ch/ch_conf.h
index 8fe69c8545..1790295ede 100644
--- a/src/ch/ch_conf.h
+++ b/src/ch/ch_conf.h
@@ -35,11 +35,13 @@ struct _virCHDriverConfig {
 
     char *stateDir;
     char *logDir;
-
+    int cgroupControllers;
     uid_t user;
     gid_t group;
 };
 
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(virCHDriverConfig, virObjectUnref);
+
 struct _virCHDriver {
     virMutex lock;
 
diff --git a/src/ch/ch_domain.c b/src/ch/ch_domain.c
index a746d0f5fd..6f0cec8c6e 100644
--- a/src/ch/ch_domain.c
+++ b/src/ch/ch_domain.c
@@ -319,6 +319,40 @@ chValidateDomainDeviceDef(const virDomainDeviceDef *dev,
                        _("Serial can only be enabled for a PTY"));
         return -1;
     }
+    return 0;
+}
+
+int
+virCHDomainRefreshThreadInfo(virDomainObj *vm)
+{
+    size_t maxvcpus = virDomainDefGetVcpusMax(vm->def);
+    virCHMonitorThreadInfo *info = NULL;
+    size_t nthreads, ncpus = 0;
+    size_t i;
+
+    nthreads = virCHMonitorGetThreadInfo(virCHDomainGetMonitor(vm),
+                                         true, &info);
+
+    for (i = 0; i < nthreads; i++) {
+        virCHDomainVcpuPrivate *vcpupriv;
+        virDomainVcpuDef *vcpu;
+        virCHMonitorCPUInfo *vcpuInfo;
+
+        if (info[i].type != virCHThreadTypeVcpu)
+            continue;
+
+        /* TODO: hotplug support */
+        vcpuInfo = &info[i].vcpuInfo;
+        vcpu = virDomainDefGetVcpu(vm->def, vcpuInfo->cpuid);
+        vcpupriv = CH_DOMAIN_VCPU_PRIVATE(vcpu);
+        vcpupriv->tid = vcpuInfo->tid;
+        ncpus++;
+    }
+
+    /* TODO: Remove the warning when hotplug is implemented.*/
+    if (ncpus != maxvcpus)
+        VIR_WARN("Mismatch in the number of cpus, expected: %ld, actual: %ld",
+                 maxvcpus, ncpus);
 
     return 0;
 }
diff --git a/src/ch/ch_domain.h b/src/ch/ch_domain.h
index 4d0b5479b8..cb94905b94 100644
--- a/src/ch/ch_domain.h
+++ b/src/ch/ch_domain.h
@@ -53,11 +53,13 @@ typedef struct _virCHDomainObjPrivate virCHDomainObjPrivate;
 struct _virCHDomainObjPrivate {
     struct virCHDomainJobObj job;
 
-    virChrdevs *chrdevs;
-    virCHDriver *driver;
-    virCHMonitor *monitor;
     char *machineName;
     virBitmap *autoCpuset;
+    virBitmap *autoNodeset;
+    virCHDriver *driver;
+    virCHMonitor *monitor;
+    virCgroup *cgroup;
+    virChrdevs *chrdevs;
 };
 
 #define CH_DOMAIN_PRIVATE(vm) \
@@ -87,7 +89,8 @@ void
 virCHDomainObjEndJob(virDomainObj *obj);
 
 int
-virCHDomainRefreshVcpuInfo(virDomainObj *vm);
+virCHDomainRefreshThreadInfo(virDomainObj *vm);
+
 pid_t
 virCHDomainGetVcpuPid(virDomainObj *vm,
                       unsigned int vcpuid);
diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c
index a19f0c7e33..d984bf9822 100644
--- a/src/ch/ch_monitor.c
+++ b/src/ch/ch_monitor.c
@@ -41,6 +41,7 @@ VIR_LOG_INIT("ch.ch_monitor");
 
 static virClass *virCHMonitorClass;
 static void virCHMonitorDispose(void *obj);
+static void virCHMonitorThreadInfoFree(virCHMonitor * mon);
 
 static int virCHMonitorOnceInit(void)
 {
@@ -578,6 +579,7 @@ static void virCHMonitorDispose(void *opaque)
     virCHMonitor *mon = opaque;
 
     VIR_DEBUG("mon=%p", mon);
+    virCHMonitorThreadInfoFree(mon);
     virObjectUnref(mon->vm);
 }
 
@@ -743,6 +745,100 @@ virCHMonitorGet(virCHMonitor *mon, const char *endpoint, virJSONValue **response
     return ret;
 }
 
+static void
+virCHMonitorThreadInfoFree(virCHMonitor *mon)
+{
+    mon->nthreads = 0;
+    if (mon->threads)
+        VIR_FREE(mon->threads);
+}
+
+static size_t
+virCHMonitorRefreshThreadInfo(virCHMonitor *mon)
+{
+    virCHMonitorThreadInfo *info = NULL;
+    g_autofree pid_t *tids = NULL;
+    virDomainObj *vm = mon->vm;
+    size_t ntids = 0;
+    size_t i;
+
+
+    virCHMonitorThreadInfoFree(mon);
+    if (virProcessGetPids(vm->pid, &ntids, &tids) < 0) {
+        mon->threads = NULL;
+        return 0;
+    }
+
+    info = g_new0(virCHMonitorThreadInfo, ntids);
+    for (i = 0; i < ntids; i++) {
+        g_autofree char *proc = NULL;
+        g_autofree char *data = NULL;
+
+        proc = g_strdup_printf("/proc/%d/task/%d/comm",
+                               (int)vm->pid, (int)tids[i]);
+
+        if (virFileReadAll(proc, (1 << 16), &data) < 0) {
+            continue;
+        }
+
+        VIR_DEBUG("VM PID: %d, TID %d, COMM: %s",
+                  (int)vm->pid, (int)tids[i], data);
+        if (STRPREFIX(data, "vcpu")) {
+            int cpuid;
+            char *tmp;
+
+            if (virStrToLong_i(data + 4, &tmp, 0, &cpuid) < 0) {
+                VIR_WARN("Index is not specified correctly");
+                continue;
+            }
+            info[i].type = virCHThreadTypeVcpu;
+            info[i].vcpuInfo.tid = tids[i];
+            info[i].vcpuInfo.online = true;
+            info[i].vcpuInfo.cpuid = cpuid;
+            VIR_DEBUG("vcpu%d -> tid: %d", cpuid, tids[i]);
+        } else if (STRPREFIX(data, "_disk") || STRPREFIX(data, "_net") ||
+                   STRPREFIX(data, "_rng")) {
+            /* Prefixes used by cloud-hypervisor for IO Threads are captured at
+             * https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/vmm/src/device_manager.rs */
+            info[i].type = virCHThreadTypeIO;
+            info[i].ioInfo.tid = tids[i];
+            virStrcpy(info[i].ioInfo.thrName, data, VIRCH_THREAD_NAME_LEN - 1);
+        } else {
+            info[i].type = virCHThreadTypeEmulator;
+            info[i].emuInfo.tid = tids[i];
+            virStrcpy(info[i].emuInfo.thrName, data, VIRCH_THREAD_NAME_LEN - 1);
+        }
+        mon->nthreads++;
+
+    }
+    mon->threads = info;
+
+    return mon->nthreads;
+}
+
+/**
+ * virCHMonitorGetThreadInfo:
+ * @mon: Pointer to the monitor
+ * @refresh: Refresh thread info or not
+ *
+ * Retrieve thread info and store to @threads
+ *
+ * Returns count of threads on success.
+ */
+size_t
+virCHMonitorGetThreadInfo(virCHMonitor *mon, bool refresh,
+                          virCHMonitorThreadInfo **threads)
+{
+    int nthreads = 0;
+
+    if (refresh)
+        nthreads = virCHMonitorRefreshThreadInfo(mon);
+
+    *threads = mon->threads;
+
+    return nthreads;
+}
+
 int
 virCHMonitorShutdownVMM(virCHMonitor *mon)
 {
diff --git a/src/ch/ch_monitor.h b/src/ch/ch_monitor.h
index f3b6978366..6646316454 100644
--- a/src/ch/ch_monitor.h
+++ b/src/ch/ch_monitor.h
@@ -37,6 +37,50 @@
 #define URL_VM_RESUME "vm.resume"
 #define URL_VM_INFO "vm.info"
 
+#define VIRCH_THREAD_NAME_LEN 16
+
+typedef enum {
+    virCHThreadTypeEmulator,
+    virCHThreadTypeVcpu,
+    virCHThreadTypeIO,
+    virCHThreadTypeMax
+} virCHThreadType;
+
+typedef struct _virCHMonitorCPUInfo virCHMonitorCPUInfo;
+
+struct _virCHMonitorCPUInfo {
+    int cpuid;
+    pid_t tid;
+
+    bool online;
+};
+
+typedef struct _virCHMonitorEmuThreadInfo virCHMonitorEmuThreadInfo;
+
+struct _virCHMonitorEmuThreadInfo {
+    char thrName[VIRCH_THREAD_NAME_LEN];
+    pid_t tid;
+};
+
+typedef struct _virCHMonitorIOThreadInfo virCHMonitorIOThreadInfo;
+
+struct _virCHMonitorIOThreadInfo {
+    char thrName[VIRCH_THREAD_NAME_LEN];
+    pid_t tid;
+};
+
+typedef struct _virCHMonitorThreadInfo virCHMonitorThreadInfo;
+
+struct _virCHMonitorThreadInfo {
+    virCHThreadType type;
+
+    union {
+        virCHMonitorCPUInfo vcpuInfo;
+        virCHMonitorEmuThreadInfo emuInfo;
+        virCHMonitorIOThreadInfo ioInfo;
+    };
+};
+
 typedef struct _virCHMonitor virCHMonitor;
 
 struct _virCHMonitor {
@@ -49,6 +93,9 @@ struct _virCHMonitor {
     pid_t pid;
 
     virDomainObj *vm;
+
+    size_t nthreads;
+    virCHMonitorThreadInfo *threads;
 };
 
 virCHMonitor *virCHMonitorNew(virDomainObj *vm, const char *socketdir);
@@ -66,12 +113,9 @@ int virCHMonitorSuspendVM(virCHMonitor *mon);
 int virCHMonitorResumeVM(virCHMonitor *mon);
 int virCHMonitorGetInfo(virCHMonitor *mon, virJSONValue **info);
 
-typedef struct _virCHMonitorCPUInfo virCHMonitorCPUInfo;
-struct _virCHMonitorCPUInfo {
-    pid_t tid;
-    bool online;
-};
 void virCHMonitorCPUInfoFree(virCHMonitorCPUInfo *cpus);
 int virCHMonitorGetCPUInfo(virCHMonitor *mon,
                            virCHMonitorCPUInfo **vcpus,
                            size_t maxvcpus);
+size_t virCHMonitorGetThreadInfo(virCHMonitor *mon, bool refresh,
+                                 virCHMonitorThreadInfo **threads);
diff --git a/src/ch/ch_process.c b/src/ch/ch_process.c
index 49976d769e..1a0730a4d1 100644
--- a/src/ch/ch_process.c
+++ b/src/ch/ch_process.c
@@ -26,6 +26,8 @@
 #include "ch_domain.h"
 #include "ch_monitor.h"
 #include "ch_process.h"
+#include "domain_cgroup.h"
+#include "virnuma.h"
 #include "viralloc.h"
 #include "virerror.h"
 #include "virjson.h"
@@ -131,6 +133,254 @@ virCHProcessUpdateInfo(virDomainObj *vm)
     return 0;
 }
 
+static int
+virCHProcessGetAllCpuAffinity(virBitmap **cpumapRet)
+{
+    *cpumapRet = NULL;
+
+    if (!virHostCPUHasBitmap())
+        return 0;
+
+    if (!(*cpumapRet = virHostCPUGetOnlineBitmap()))
+        return -1;
+
+    return 0;
+}
+
+#if defined(WITH_SCHED_GETAFFINITY) || defined(WITH_BSD_CPU_AFFINITY)
+static int
+virCHProcessInitCpuAffinity(virDomainObj *vm)
+{
+    g_autoptr(virBitmap) cpumapToSet = NULL;
+    virDomainNumatuneMemMode mem_mode;
+    virCHDomainObjPrivate *priv = vm->privateData;
+
+    if (!vm->pid) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("Cannot setup CPU affinity until process is started"));
+        return -1;
+    }
+
+    if (virDomainNumaGetNodeCount(vm->def->numa) <= 1 &&
+        virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 &&
+        mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT) {
+        virBitmap *nodeset = NULL;
+
+        if (virDomainNumatuneMaybeGetNodeset(vm->def->numa,
+                                             priv->autoNodeset,
+                                             &nodeset, -1) < 0)
+            return -1;
+
+        if (virNumaNodesetToCPUset(nodeset, &cpumapToSet) < 0)
+            return -1;
+    } else if (vm->def->cputune.emulatorpin) {
+        if (!(cpumapToSet = virBitmapNewCopy(vm->def->cputune.emulatorpin)))
+            return -1;
+    } else {
+        if (virCHProcessGetAllCpuAffinity(&cpumapToSet) < 0)
+            return -1;
+    }
+
+    if (cpumapToSet && virProcessSetAffinity(vm->pid, cpumapToSet, false) < 0) {
+        return -1;
+    }
+
+    return 0;
+}
+#else /* !defined(WITH_SCHED_GETAFFINITY) && !defined(WITH_BSD_CPU_AFFINITY) */
+static int
+virCHProcessInitCpuAffinity(virDomainObj *vm G_GNUC_UNUSED)
+{
+    return 0;
+}
+#endif /* !defined(WITH_SCHED_GETAFFINITY) && !defined(WITH_BSD_CPU_AFFINITY) */
+
+/**
+ * virCHProcessSetupPid:
+ *
+ * This function sets resource properties (affinity, cgroups,
+ * scheduler) for any PID associated with a domain. It should be used
+ * to set up emulator PIDs as well as vCPU and I/O thread pids to
+ * ensure they are all handled the same way.
+ *
+ * Returns 0 on success, -1 on error.
+ */
+static int
+virCHProcessSetupPid(virDomainObj *vm,
+                     pid_t pid,
+                     virCgroupThreadName nameval,
+                     int id,
+                     virBitmap *cpumask,
+                     unsigned long long period,
+                     long long quota,
+                     virDomainThreadSchedParam *sched)
+{
+    virCHDomainObjPrivate *priv = vm->privateData;
+    virDomainNumatuneMemMode mem_mode;
+    virCgroup *cgroup = NULL;
+    virBitmap *use_cpumask = NULL;
+    virBitmap *affinity_cpumask = NULL;
+    g_autoptr(virBitmap) hostcpumap = NULL;
+    g_autofree char *mem_mask = NULL;
+    int ret = -1;
+
+    if ((period || quota) &&
+        !virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("cgroup cpu is required for scheduler tuning"));
+        goto cleanup;
+    }
+
+    /* Infer which cpumask shall be used. */
+    if (cpumask) {
+        use_cpumask = cpumask;
+    } else if (vm->def->placement_mode == VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO) {
+        use_cpumask = priv->autoCpuset;
+    } else if (vm->def->cpumask) {
+        use_cpumask = vm->def->cpumask;
+    } else {
+        /* we can't assume cloud-hypervisor itself is running on all pCPUs,
+         * so we need to explicitly set the spawned instance to all pCPUs. */
+        if (virCHProcessGetAllCpuAffinity(&hostcpumap) < 0)
+            goto cleanup;
+        affinity_cpumask = hostcpumap;
+    }
+
+    /*
+     * If CPU cgroup controller is not initialized here, then we need
+     * neither period nor quota settings. And if CPUSET controller is
+     * not initialized either, then there's nothing to do anyway.
+     */
+    if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) ||
+        virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) {
+
+        if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 &&
+            mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT &&
+            virDomainNumatuneMaybeFormatNodeset(vm->def->numa,
+                                                priv->autoNodeset,
+                                                &mem_mask, -1) < 0)
+            goto cleanup;
+
+        if (virCgroupNewThread(priv->cgroup, nameval, id, true, &cgroup) < 0)
+            goto cleanup;
+
+        if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) {
+            if (use_cpumask &&
+                virDomainCgroupSetupCpusetCpus(cgroup, use_cpumask) < 0)
+                goto cleanup;
+
+            if (mem_mask && virCgroupSetCpusetMems(cgroup, mem_mask) < 0)
+                goto cleanup;
+
+        }
+
+        if ((period || quota) &&
+            virDomainCgroupSetupVcpuBW(cgroup, period, quota) < 0)
+            goto cleanup;
+
+        /* Move the thread to the sub dir */
+        VIR_INFO("Adding pid %d to cgroup", pid);
+        if (virCgroupAddThread(cgroup, pid) < 0)
+            goto cleanup;
+
+    }
+
+    if (!affinity_cpumask)
+        affinity_cpumask = use_cpumask;
+
+    /* Setup legacy affinity. */
+    if (affinity_cpumask
+        && virProcessSetAffinity(pid, affinity_cpumask, false) < 0)
+        goto cleanup;
+
+    /* Set scheduler type and priority, but not for the main thread. */
+    if (sched &&
+        nameval != VIR_CGROUP_THREAD_EMULATOR &&
+        virProcessSetScheduler(pid, sched->policy, sched->priority) < 0)
+        goto cleanup;
+
+    ret = 0;
+ cleanup:
+    if (cgroup) {
+        if (ret < 0)
+            virCgroupRemove(cgroup);
+        virCgroupFree(cgroup);
+    }
+
+    return ret;
+}
+
+/**
+ * virCHProcessSetupVcpu:
+ * @vm: domain object
+ * @vcpuid: id of VCPU to set defaults
+ *
+ * This function sets resource properties (cgroups, affinity, scheduler) for a
+ * vCPU. This function expects that the vCPU is online and the vCPU pids were
+ * correctly detected at the point when it's called.
+ *
+ * Returns 0 on success, -1 on error.
+ */
+int
+virCHProcessSetupVcpu(virDomainObj *vm, unsigned int vcpuid)
+{
+    pid_t vcpupid = virCHDomainGetVcpuPid(vm, vcpuid);
+    virDomainVcpuDef *vcpu = virDomainDefGetVcpu(vm->def, vcpuid);
+
+    return virCHProcessSetupPid(vm, vcpupid, VIR_CGROUP_THREAD_VCPU,
+                                vcpuid, vcpu->cpumask,
+                                vm->def->cputune.period,
+                                vm->def->cputune.quota, &vcpu->sched);
+}
+
+static int
+virCHProcessSetupVcpus(virDomainObj *vm)
+{
+    virDomainVcpuDef *vcpu;
+    unsigned int maxvcpus = virDomainDefGetVcpusMax(vm->def);
+    size_t i;
+
+    if ((vm->def->cputune.period || vm->def->cputune.quota) &&
+        !virCgroupHasController(((virCHDomainObjPrivate *) vm->privateData)->
+                                cgroup, VIR_CGROUP_CONTROLLER_CPU)) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("cgroup cpu is required for scheduler tuning"));
+        return -1;
+    }
+
+    if (!virCHDomainHasVcpuPids(vm)) {
+        /* If any CPU has custom affinity that differs from the
+         * VM default affinity, we must reject it */
+        for (i = 0; i < maxvcpus; i++) {
+            vcpu = virDomainDefGetVcpu(vm->def, i);
+
+            if (!vcpu->online)
+                continue;
+
+            if (vcpu->cpumask &&
+                !virBitmapEqual(vm->def->cpumask, vcpu->cpumask)) {
+                virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                               _("cpu affinity is not supported"));
+                return -1;
+            }
+        }
+
+        return 0;
+    }
+
+    for (i = 0; i < maxvcpus; i++) {
+        vcpu = virDomainDefGetVcpu(vm->def, i);
+
+        if (!vcpu->online)
+            continue;
+
+        if (virCHProcessSetupVcpu(vm, i) < 0)
+            return -1;
+    }
+
+    return 0;
+}
+
 /**
  * virCHProcessStart:
  * @driver: pointer to driver structure
@@ -141,12 +391,13 @@ virCHProcessUpdateInfo(virDomainObj *vm)
  *
  * Returns 0 on success or -1 in case of error
  */
-int virCHProcessStart(virCHDriver *driver,
-                      virDomainObj *vm,
-                      virDomainRunningReason reason)
+int
+virCHProcessStart(virCHDriver *driver,
+                  virDomainObj *vm, virDomainRunningReason reason)
 {
     int ret = -1;
     virCHDomainObjPrivate *priv = vm->privateData;
+    g_autoptr(virCHDriverConfig) cfg = virCHDriverGetConfig(priv->driver);
     g_autofree int *nicindexes = NULL;
     size_t nnicindexes = 0;
 
@@ -166,18 +417,41 @@ int virCHProcessStart(virCHDriver *driver,
         }
     }
 
+    vm->pid = priv->monitor->pid;
+    vm->def->id = vm->pid;
+    priv->machineName = virCHDomainGetMachineName(vm);
+
+    if (virDomainCgroupSetupCgroup("ch", vm,
+                                   nnicindexes, nicindexes,
+                                   priv->cgroup,
+                                   cfg->cgroupControllers,
+                                   0, /*maxThreadsPerProc*/
+                                   priv->driver->privileged,
+                                   priv->machineName) < 0)
+        goto cleanup;
+
+    if (virCHProcessInitCpuAffinity(vm) < 0)
+        goto cleanup;
+
     if (virCHMonitorBootVM(priv->monitor) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("failed to boot guest VM"));
         goto cleanup;
     }
 
-    priv->machineName = virCHDomainGetMachineName(vm);
-    vm->pid = priv->monitor->pid;
-    vm->def->id = vm->pid;
+    virCHDomainRefreshThreadInfo(vm);
 
-    virCHProcessUpdateInfo(vm);
+    VIR_DEBUG("Setting global CPU cgroup (if required)");
+    if (virDomainCgroupSetupGlobalCpuCgroup(vm,
+                                            priv->cgroup,
+                                            priv->autoNodeset) < 0)
+        goto cleanup;
+
+    VIR_DEBUG("Setting vCPU tuning/settings");
+    if (virCHProcessSetupVcpus(vm) < 0)
+        goto cleanup;
 
+    virCHProcessUpdateInfo(vm);
     virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, reason);
 
     return 0;
@@ -189,10 +463,12 @@ int virCHProcessStart(virCHDriver *driver,
     return ret;
 }
 
-int virCHProcessStop(virCHDriver *driver G_GNUC_UNUSED,
-                     virDomainObj *vm,
-                     virDomainShutoffReason reason)
+int
+virCHProcessStop(virCHDriver *driver G_GNUC_UNUSED,
+                 virDomainObj *vm, virDomainShutoffReason reason)
 {
+    int ret;
+    int retries = 0;
     virCHDomainObjPrivate *priv = vm->privateData;
 
     VIR_DEBUG("Stopping VM name=%s pid=%d reason=%d",
@@ -203,6 +479,18 @@ int virCHProcessStop(virCHDriver *driver G_GNUC_UNUSED,
         priv->monitor = NULL;
     }
 
+ retry:
+    if ((ret = virDomainCgroupRemoveCgroup(vm,
+                                           priv->cgroup,
+                                           priv->machineName)) < 0) {
+        if (ret == -EBUSY && (retries++ < 5)) {
+            g_usleep(200*1000);
+            goto retry;
+        }
+        VIR_WARN("Failed to remove cgroup for %s",
+                 vm->def->name);
+    }
+
     vm->pid = -1;
     vm->def->id = -1;
     g_clear_pointer(&priv->machineName, g_free);
diff --git a/src/ch/ch_process.h b/src/ch/ch_process.h
index abc4915979..800e3f4e23 100644
--- a/src/ch/ch_process.h
+++ b/src/ch/ch_process.h
@@ -29,3 +29,6 @@ int virCHProcessStart(virCHDriver *driver,
 int virCHProcessStop(virCHDriver *driver,
                      virDomainObj *vm,
                      virDomainShutoffReason reason);
+
+int virCHProcessSetupVcpu(virDomainObj *vm,
+                          unsigned int vcpuid);
-- 
2.27.0
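
Illustration (not part of the patch): virCHMonitorRefreshThreadInfo() above sorts each thread of the cloud-hypervisor process into vCPU, I/O, or emulator by the prefix of its /proc/<pid>/task/<tid>/comm name. The standalone sketch below mirrors only that prefix rule in plain C, without any libvirt helpers. Every identifier in it (sketch_thread_type, sketch_has_prefix, sketch_classify_thread) is hypothetical. It also assumes a decimal vCPU index, whereas the patch calls virStrToLong_i() with base 0, and it reports a malformed index to the caller, whereas the patch logs a warning and skips the thread.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's virCHThreadType values. */
typedef enum {
    SKETCH_THREAD_EMULATOR,
    SKETCH_THREAD_VCPU,
    SKETCH_THREAD_IO
} sketch_thread_type;

/* Return non-zero when @s starts with @prefix. */
static int
sketch_has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Classify one thread by its comm name (which may end in '\n').
 * On a vCPU thread, *cpuid receives the index that follows "vcpu".
 * Returns 0 on success, -1 when a "vcpu" name carries no usable index.
 */
static int
sketch_classify_thread(const char *comm, sketch_thread_type *type, int *cpuid)
{
    if (sketch_has_prefix(comm, "vcpu")) {
        const char *digits = comm + 4;
        char *end = NULL;
        long val;

        errno = 0;
        val = strtol(digits, &end, 10);
        /* end == digits means strtol consumed no digits at all. */
        if (end == digits || errno != 0 || val < 0 || val > INT_MAX)
            return -1;

        *type = SKETCH_THREAD_VCPU;
        *cpuid = (int)val;
        return 0;
    }

    /* Same I/O prefixes the patch checks: "_disk", "_net", "_rng". */
    if (sketch_has_prefix(comm, "_disk") ||
        sketch_has_prefix(comm, "_net") ||
        sketch_has_prefix(comm, "_rng")) {
        *type = SKETCH_THREAD_IO;
        return 0;
    }

    /* Everything else is treated as an emulator thread. */
    *type = SKETCH_THREAD_EMULATOR;
    return 0;
}
```

A caller would read each comm file, pass the contents to sketch_classify_thread(), and record the tid under the returned type, as the patch does with its info[] array.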