[libvirt] [PATCH] Fix guest boot failure when vcpu placement="auto" on memoryless numa node

Nitesh Konkar posted 1 patch 6 years, 9 months ago
git fetch https://github.com/patchew-project/libvirt tags/patchew/20170710063936.18982-1-niteshkonkar.libvirt@gmail.com
Posted by Nitesh Konkar 6 years, 9 months ago
When the vcpu placement is "auto" and the host has memoryless NUMA nodes,
numad returns a list of NUMA nodes both with and without memory. When we try to
write that list to /sys/fs/cgroup/*/cpuset.mems, the write fails with "Invalid argument".

Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
---
numactl --hardware
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 8 16 24 32
node 0 size: 32500 MB
node 0 free: 25584 MB
node 1 cpus: 40 48 56 64 72
node 1 size: 0 MB----------------------------------------------#
node 1 free: 0 MB
node 16 cpus: 80 88 96 104 112
node 16 size: 32613 MB
node 16 free: 30991 MB
node 17 cpus: 120 128 136 144 152
node 17 size: 0 MB--------------------------------------------#
node 17 free: 0 MB
node distances:
node   0   1  16  17 
  0:  10  20  40  40 
  1:  20  10  40  40 
 16:  40  40  10  20 
 17:  40  40  20  10 

virsh start virt-tests-vm1
error: Failed to start domain virt-tests-vm1
error: Invalid value '0-1,16-17' for 'cpuset.mems': Invalid argument--------------NOK

 src/qemu/qemu_cgroup.c  | 4 +++-
 src/qemu/qemu_process.c | 5 ++++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 36762d4..fd8deb1 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -723,10 +723,12 @@ qemuSetupCpusetMems(virDomainObjPtr vm)
 {
     virCgroupPtr cgroup_temp = NULL;
     qemuDomainObjPrivatePtr priv = vm->privateData;
+    virBitmapPtr nodeSet = NULL;
     virDomainNumatuneMemMode mode;
     char *mem_mask = NULL;
     int ret = -1;
 
+    nodeSet = virNumaGetHostMemoryNodeset(); 
     if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET))
         return 0;
 
@@ -735,7 +737,7 @@ qemuSetupCpusetMems(virDomainObjPtr vm)
         return 0;
 
     if (virDomainNumatuneMaybeFormatNodeset(vm->def->numa,
-                                            priv->autoNodeset,
+                                            nodeSet,
                                             &mem_mask, -1) < 0)
         goto cleanup;
 
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index fa9990e..074a0cd 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2374,6 +2374,7 @@ qemuProcessSetupPid(virDomainObjPtr vm,
     virDomainNumatuneMemMode mem_mode;
     virCgroupPtr cgroup = NULL;
     virBitmapPtr use_cpumask;
+    virBitmapPtr nodeSet = NULL;
     char *mem_mask = NULL;
     int ret = -1;
 
@@ -2397,13 +2398,15 @@ qemuProcessSetupPid(virDomainObjPtr vm,
      * neither period nor quota settings.  And if CPUSET controller is
      * not initialized either, then there's nothing to do anyway.
      */
+    nodeSet = virNumaGetHostMemoryNodeset(); 
+
     if (virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPU) ||
         virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET)) {
 
         if (virDomainNumatuneGetMode(vm->def->numa, -1, &mem_mode) == 0 &&
             mem_mode == VIR_DOMAIN_NUMATUNE_MEM_STRICT &&
             virDomainNumatuneMaybeFormatNodeset(vm->def->numa,
-                                                priv->autoNodeset,
+                                                nodeSet,
                                                 &mem_mask, -1) < 0)
             goto cleanup;
 
-- 
1.8.3.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] Fix guest boot failure when vcpu placement="auto" on memoryless numa node
Posted by Peter Krempa 6 years, 9 months ago
On Mon, Jul 10, 2017 at 12:09:36 +0530, Nitesh Konkar wrote:
> When the vcpu placement is "auto" and the host has memoryless NUMA nodes,
> numad returns a list of NUMA nodes both with and without memory. When we try to
> write that list to /sys/fs/cgroup/*/cpuset.mems, the write fails with "Invalid argument".
> 
> Signed-off-by: Nitesh Konkar <nitkon12@linux.vnet.ibm.com>
> ---
> numactl --hardware
> available: 4 nodes (0-1,16-17)
> node 0 cpus: 0 8 16 24 32
> node 0 size: 32500 MB
> node 0 free: 25584 MB
> node 1 cpus: 40 48 56 64 72
> node 1 size: 0 MB----------------------------------------------#
> node 1 free: 0 MB
> node 16 cpus: 80 88 96 104 112
> node 16 size: 32613 MB
> node 16 free: 30991 MB
> node 17 cpus: 120 128 136 144 152
> node 17 size: 0 MB--------------------------------------------#
> node 17 free: 0 MB
> node distances:
> node   0   1  16  17 
>   0:  10  20  40  40 
>   1:  20  10  40  40 
>  16:  40  40  10  20 
>  17:  40  40  20  10 
> 
> virsh start virt-tests-vm1
> error: Failed to start domain virt-tests-vm1
> error: Invalid value '0-1,16-17' for 'cpuset.mems': Invalid argument--------------NOK
> 
>  src/qemu/qemu_cgroup.c  | 4 +++-
>  src/qemu/qemu_process.c | 5 ++++-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
> index 36762d4..fd8deb1 100644
> --- a/src/qemu/qemu_cgroup.c
> +++ b/src/qemu/qemu_cgroup.c
> @@ -723,10 +723,12 @@ qemuSetupCpusetMems(virDomainObjPtr vm)
>  {
>      virCgroupPtr cgroup_temp = NULL;
>      qemuDomainObjPrivatePtr priv = vm->privateData;
> +    virBitmapPtr nodeSet = NULL;
>      virDomainNumatuneMemMode mode;
>      char *mem_mask = NULL;
>      int ret = -1;
>  
> +    nodeSet = virNumaGetHostMemoryNodeset(); 

So this returns the list of the host's NUMA nodes with memory ...

>      if (!virCgroupHasController(priv->cgroup, VIR_CGROUP_CONTROLLER_CPUSET))
>          return 0;
>  
> @@ -735,7 +737,7 @@ qemuSetupCpusetMems(virDomainObjPtr vm)
>          return 0;
>  
>      if (virDomainNumatuneMaybeFormatNodeset(vm->def->numa,
> -                                            priv->autoNodeset,
> +                                            nodeSet,

... thus here you'd use all of the host's nodes containing memory instead of
what numad told us. We need to intersect those bitmaps so that we only get the
nodes suggested by numad which also contain some memory.
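
To illustrate the combination of the two node sets discussed above, here is a
minimal standalone sketch. It is not libvirt code: node sets are modelled as
plain 64-bit masks instead of virBitmap, and node_mask(), usable_mem_nodes()
and format_nodeset() are hypothetical helpers invented for this example. The
idea is simply to keep the nodes that numad advised and that also have memory,
and only then format the string for cpuset.mems.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not libvirt API): build a mask in which bit N set
 * means NUMA node N is a member. Only nodes 0..63 are supported here. */
static uint64_t node_mask(const int *nodes, size_t n)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        assert(nodes[i] >= 0 && nodes[i] < 64);
        mask |= UINT64_C(1) << nodes[i];
    }
    return mask;
}

/* Keep only the nodes advised by numad that also have memory. */
static uint64_t usable_mem_nodes(uint64_t advised, uint64_t has_memory)
{
    return advised & has_memory;
}

/* Format a mask as a cpuset-style list such as "0,16" or "0-1,16-17".
 * Returns buf; the output is cut short if buf is too small. */
static char *format_nodeset(uint64_t mask, char *buf, size_t len)
{
    size_t off = 0;

    if (len > 0)
        buf[0] = '\0';

    for (int i = 0; i < 64; i++) {
        if (!(mask & (UINT64_C(1) << i)))
            continue;

        /* Extend the range while the next node is also in the set. */
        int start = i;
        while (i + 1 < 64 && (mask & (UINT64_C(1) << (i + 1))))
            i++;

        int n;
        if (start == i)
            n = snprintf(buf + off, len - off, "%s%d",
                         off ? "," : "", start);
        else
            n = snprintf(buf + off, len - off, "%s%d-%d",
                         off ? "," : "", start, i);
        if (n < 0 || (size_t)n >= len - off)
            break;
        off += (size_t)n;
    }
    return buf;
}
```

With the host from the report, the advised set 0-1,16-17 combined with the
memory-backed nodes 0 and 16 formats as "0,16", which no longer contains the
memoryless nodes 1 and 17.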

For this I'd introduce a new bitmap, similar to autoNodeset and
autoCpuset, which will hold only the nodes with memory. It's also questionable
whether we need any of this and can't just reuse autoNodeset for this
(beware, it's not that easy, since it's used to derive autoCpuset, thus
they need to be refactored carefully).

I think I'll send a patch with my suggested changes.