[libvirt] [PATCH 0/4] qemu: Honor memory mode='strict'
Posted by Michal Privoznik 5 years ago
If there's a domain configured as:

  <currentMemory unit='MiB'>4096</currentMemory>
  <numatune>
    <memory mode='strict' nodeset='1'/>
  </numatune>

but there is not enough memory on NUMA node 1, the domain will start
successfully because we allow it to allocate memory from other nodes.
This is a result of an earlier fix (v1.2.7-rc1~91). However, I've
tested my fix successfully on a NUMA machine with F29 and a recent
kernel, so the kernel bug mentioned in 4/4 is probably fixed by now
and we can drop the workaround.
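
For anyone who wants to reproduce this, a rough way to see where the guest's
memory actually ended up (the guest name 'demo' and the cgroup path below are
only illustrative and depend on the host setup):

  # per-node breakdown of the running guest's allocation; numastat comes
  # with the numactl package, 'demo' is just an example guest name:
  $ numastat -p $(pgrep -f 'guest=demo')

  # the cpuset the emulator cgroup ended up with (cgroup v1 layout,
  # exact scope name is host-dependent):
  $ cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu*demo*.scope/emulator/cpuset.mems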

Michal Prívozník (4):
  qemuSetupCpusetMems: Use VIR_AUTOFREE()
  qemuSetupCpusetMems: Create EMULATOR thread upfront
  qemu_cgroup: Make qemuSetupCpusetMems static
  qemuSetupCpusetCgroup: Set up cpuset.mems before execing qemu

 src/qemu/qemu_cgroup.c | 12 +++++++-----
 src/qemu/qemu_cgroup.h |  1 -
 2 files changed, 7 insertions(+), 6 deletions(-)

-- 
2.21.0

Re: [libvirt] [PATCH 0/4] qemu: Honor memory mode='strict'
Posted by Daniel Henrique Barboza 5 years ago
Hi,

On 4/9/19 11:10 AM, Michal Privoznik wrote:
> If there's a domain configured as:
>
>    <currentMemory unit='MiB'>4096</currentMemory>
>    <numatune>
>      <memory mode='strict' nodeset='1'/>
>    </numatune>
>
> but there is not enough memory on NUMA node 1, the domain will start
> successfully because we allow it to allocate memory from other nodes.
> This is a result of an earlier fix (v1.2.7-rc1~91). However, I've
> tested my fix successfully on a NUMA machine with F29 and a recent
> kernel, so the kernel bug mentioned in 4/4 is probably fixed by now
> and we can drop the workaround.

Out of curiosity, I've tested your patch set on a Power8 system to see
if I could spot a difference, but in my case it didn't change the behavior.

I've tried a guest with the following numatune:


   <memory unit='KiB'>67108864</memory>
   <currentMemory unit='KiB'>67108864</currentMemory>
   <vcpu placement='static' current='4'>16</vcpu>
   <numatune>
     <memory mode='strict' nodeset='0'/>
   </numatune>

This is the numa setup of the host:

$ numactl -H
available: 4 nodes (0-1,16-17)
node 0 cpus: 0 8 16 24 32 40
node 0 size: 32606 MB
node 0 free: 24125 MB
node 1 cpus: 48 56 64 72 80 88
node 1 size: 32704 MB
node 1 free: 27657 MB
node 16 cpus: 96 104 112 120 128 136
node 16 size: 32704 MB
node 16 free: 25455 MB
node 17 cpus: 144 152 160 168 176 184
node 17 size: 32565 MB
node 17 free: 30030 MB
node distances:
node   0   1  16  17
   0:  10  20  40  40
   1:  20  10  40  40
  16:  40  40  10  20
  17:  40  40  20  10


If I understood it right, the patches removed the capability to allocate
memory from other NUMA nodes with the 'strict' setting, making the
guest fail to launch if the NUMA node does not have enough memory.
Unless I am getting something wrong, this guest shouldn't launch after
applying these patches (node 0 does not have 64 GiB available). But the
guest is launching as if nothing changed.
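
Just to spell the numbers out with a quick shell check (nothing below that
isn't already in the XML and the numactl output above):

  $ echo "$((67108864 / 1024 / 1024)) GiB requested vs. $((24125 / 1024)) GiB (24125 MB) free on node 0"
  64 GiB requested vs. 23 GiB (24125 MB) free on node 0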


I'll dig into it further if I have the chance. I'm just curious whether this
is something that works differently with pseries guests.


Thanks,

DHB


>
> Michal Prívozník (4):
>    qemuSetupCpusetMems: Use VIR_AUTOFREE()
>    qemuSetupCpusetMems: Create EMULATOR thread upfront
>    qemu_cgroup: Make qemuSetupCpusetMems static
>    qemuSetupCpusetCgroup: Set up cpuset.mems before execing qemu
>
>   src/qemu/qemu_cgroup.c | 12 +++++++-----
>   src/qemu/qemu_cgroup.h |  1 -
>   2 files changed, 7 insertions(+), 6 deletions(-)
>

Re: [libvirt] [PATCH 0/4] qemu: Honor memory mode='strict'
Posted by Michal Privoznik 5 years ago
On 4/10/19 12:35 AM, Daniel Henrique Barboza wrote:
> Hi,
> 
> On 4/9/19 11:10 AM, Michal Privoznik wrote:
>> If there's a domain configured as:
>>
>>    <currentMemory unit='MiB'>4096</currentMemory>
>>    <numatune>
>>      <memory mode='strict' nodeset='1'/>
>>    </numatune>
>>
>> but there is not enough memory on NUMA node 1, the domain will start
>> successfully because we allow it to allocate memory from other nodes.
>> This is a result of an earlier fix (v1.2.7-rc1~91). However, I've
>> tested my fix successfully on a NUMA machine with F29 and a recent
>> kernel, so the kernel bug mentioned in 4/4 is probably fixed by now
>> and we can drop the workaround.
> 
> Out of curiosity, I've tested your patch set on a Power8 system to see
> if I could spot a difference, but in my case it didn't change the behavior.
> 
> I've tried a guest with the following numatune:
> 
> 
>    <memory unit='KiB'>67108864</memory>
>    <currentMemory unit='KiB'>67108864</currentMemory>
>    <vcpu placement='static' current='4'>16</vcpu>
>    <numatune>
>      <memory mode='strict' nodeset='0'/>
>    </numatune>
> 
> This is the numa setup of the host:
> 
> $ numactl -H
> available: 4 nodes (0-1,16-17)
> node 0 cpus: 0 8 16 24 32 40
> node 0 size: 32606 MB
> node 0 free: 24125 MB
> node 1 cpus: 48 56 64 72 80 88
> node 1 size: 32704 MB
> node 1 free: 27657 MB
> node 16 cpus: 96 104 112 120 128 136
> node 16 size: 32704 MB
> node 16 free: 25455 MB
> node 17 cpus: 144 152 160 168 176 184
> node 17 size: 32565 MB
> node 17 free: 30030 MB
> node distances:
> node   0   1  16  17
>    0:  10  20  40  40
>    1:  20  10  40  40
>   16:  40  40  10  20
>   17:  40  40  20  10
> 
> 
> If I understood it right, the patches removed the capability to allocate
> memory from other NUMA nodes with the 'strict' setting, making the
> guest fail to launch if the NUMA node does not have enough memory.
> Unless I am getting something wrong, this guest shouldn't launch after
> applying these patches (node 0 does not have 64 GiB available). But the
> guest is launching as if nothing changed.
> 
> 
> I'll dig into it further if I have the chance. I'm just curious whether this
> is something that works differently with pseries guests.

Hey,

firstly, thanks for testing this! And yes, it shows a flaw in my patches.
The thing is, my patches set up emulator/cpuset.mems, but at the time qemu
is doing its allocation it is still living under the top-level cgroup (which
is left untouched). Will post a v2! Thanks.
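
For the record, a rough sketch of the two cgroup levels involved (cgroup v1
layout; the scope path is host-dependent and 'demo' is only an example guest
name):

  # the group qemu is still attached to while it does its initial allocation:
  $ cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu*demo*.scope/cpuset.mems
  # the emulator sub-group the patches restrict, which qemu only joins later:
  $ cat /sys/fs/cgroup/cpuset/machine.slice/machine-qemu*demo*.scope/emulator/cpuset.mems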

Michal
