[Qemu-devel] [PATCH] numa, spapr: align default numa node memory size to 256MB

Laurent Vivier posted 1 patch 7 years, 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20170320112426.4030-1-lvivier@redhat.com
Test s390x passed
There is a newer version of this series
dtc    |  2 +-
numa.c | 14 +++++++++-----
2 files changed, 10 insertions(+), 6 deletions(-)
[Qemu-devel] [PATCH] numa, spapr: align default numa node memory size to 256MB
Posted by Laurent Vivier 7 years, 1 month ago
Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).

But when "-numa" option is provided without "mem" parameter,
the memory is equally divided between nodes, but 8MB aligned.
This can be not valid for pseries.

In that case we can have:
$ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB

With this patch, we have:
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 1280 MB
node 1 cpus:
node 1 size: 1280 MB
node 2 cpus:
node 2 size: 1536 MB

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 dtc    |  2 +-
 numa.c | 14 +++++++++-----
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/dtc b/dtc
index 558cd81..fa8bc7f 160000
--- a/dtc
+++ b/dtc
@@ -1 +1 @@
-Subproject commit 558cd81bdd432769b59bff01240c44f82cfb1a9d
+Subproject commit fa8bc7f928ac25f23532afc8beb2073efc8fb063
diff --git a/numa.c b/numa.c
index e01cb54..a911284 100644
--- a/numa.c
+++ b/numa.c
@@ -337,15 +337,19 @@ void parse_numa_opts(MachineClass *mc)
         }
         if (i == nb_numa_nodes) {
             uint64_t usedmem = 0;
-
-            /* On Linux, each node's border has to be 8MB aligned,
-             * the final node gets the rest.
-             */
+#if defined(TARGET_PPC64)
+            /* pseries requests each node's border has to be 256 MB aligned */
+            const uint64_t numa_mem_align_mask = ~((1 << 28UL) - 1);
+#else
+            /* On Linux, each node's border has to be 8MB aligned */
+            const uint64_t numa_mem_align_mask = ~((1 << 23UL)  - 1);
+#endif
             for (i = 0; i < nb_numa_nodes - 1; i++) {
                 numa_info[i].node_mem = (ram_size / nb_numa_nodes) &
-                                        ~((1 << 23UL) - 1);
+                                        numa_mem_align_mask;
                 usedmem += numa_info[i].node_mem;
             }
+            /* the final node gets the rest. */
             numa_info[i].node_mem = ram_size - usedmem;
         }
 
-- 
2.9.3


Re: [Qemu-devel] [PATCH] numa, spapr: align default numa node memory size to 256MB
Posted by Daniel P. Berrange 7 years, 1 month ago
On Mon, Mar 20, 2017 at 12:24:26PM +0100, Laurent Vivier wrote:
> Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
> memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).

That commit only enabled the feature for the pseries-2.5 machine type
several releases back now...

> But when "-numa" option is provided without "mem" parameter,
> the memory is equally divided between nodes, but 8MB aligned.
> This can be not valid for pseries.
> 
> In that case we can have:
> $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
> qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
> 
> With this patch, we have:
> (qemu) info numa
> 3 nodes
> node 0 cpus: 0
> node 0 size: 1280 MB
> node 1 cpus:
> node 1 size: 1280 MB
> node 2 cpus:
> node 2 size: 1536 MB
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
>  dtc    |  2 +-
>  numa.c | 14 +++++++++-----
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/dtc b/dtc
> index 558cd81..fa8bc7f 160000
> --- a/dtc
> +++ b/dtc
> @@ -1 +1 @@
> -Subproject commit 558cd81bdd432769b59bff01240c44f82cfb1a9d
> +Subproject commit fa8bc7f928ac25f23532afc8beb2073efc8fb063

This looks unrelated

> diff --git a/numa.c b/numa.c
> index e01cb54..a911284 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -337,15 +337,19 @@ void parse_numa_opts(MachineClass *mc)
>          }
>          if (i == nb_numa_nodes) {
>              uint64_t usedmem = 0;
> -
> -            /* On Linux, each node's border has to be 8MB aligned,
> -             * the final node gets the rest.
> -             */
> +#if defined(TARGET_PPC64)
> +            /* pseries requests each node's border has to be 256 MB aligned */
> +            const uint64_t numa_mem_align_mask = ~((1 << 28UL) - 1);
> +#else

but here you're forcing 256 MB alignement for all machine types. This is
surely breaking machine ABI compat for anyone who previously used -numa
with QEMU and upgrades to new QEMU expecting the same ABI.


Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

Re: [Qemu-devel] [PATCH] numa, spapr: align default numa node memory size to 256MB
Posted by Laurent Vivier 7 years, 1 month ago
On 20/03/2017 12:30, Daniel P. Berrange wrote:
> On Mon, Mar 20, 2017 at 12:24:26PM +0100, Laurent Vivier wrote:
>> Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
>> memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).
> 
> That commit only enabled the feature for the pseries-2.5 machine type
> several releases back now...
> 
>> But when "-numa" option is provided without "mem" parameter,
>> the memory is equally divided between nodes, but 8MB aligned.
>> This can be not valid for pseries.
>>
>> In that case we can have:
>> $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
>> qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
>>
>> With this patch, we have:
>> (qemu) info numa
>> 3 nodes
>> node 0 cpus: 0
>> node 0 size: 1280 MB
>> node 1 cpus:
>> node 1 size: 1280 MB
>> node 2 cpus:
>> node 2 size: 1536 MB
>>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>>  dtc    |  2 +-
>>  numa.c | 14 +++++++++-----
>>  2 files changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/dtc b/dtc
>> index 558cd81..fa8bc7f 160000
>> --- a/dtc
>> +++ b/dtc
>> @@ -1 +1 @@
>> -Subproject commit 558cd81bdd432769b59bff01240c44f82cfb1a9d
>> +Subproject commit fa8bc7f928ac25f23532afc8beb2073efc8fb063
> 
> This looks unrelated
> 
>> diff --git a/numa.c b/numa.c
>> index e01cb54..a911284 100644
>> --- a/numa.c
>> +++ b/numa.c
>> @@ -337,15 +337,19 @@ void parse_numa_opts(MachineClass *mc)
>>          }
>>          if (i == nb_numa_nodes) {
>>              uint64_t usedmem = 0;
>> -
>> -            /* On Linux, each node's border has to be 8MB aligned,
>> -             * the final node gets the rest.
>> -             */
>> +#if defined(TARGET_PPC64)
>> +            /* pseries requests each node's border has to be 256 MB aligned */
>> +            const uint64_t numa_mem_align_mask = ~((1 << 28UL) - 1);
>> +#else
> 
> but here you're forcing 256 MB alignement for all machine types. This is
> surely breaking machine ABI compat for anyone who previously used -numa
> with QEMU and upgrades to new QEMU expecting the same ABI.

Yes, you're right. I'm going to fix that...

Thanks,
Laurent


Re: [Qemu-devel] [PATCH] numa, spapr: align default numa node memory size to 256MB
Posted by Thomas Huth 7 years, 1 month ago
On 20.03.2017 12:24, Laurent Vivier wrote:
> Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
> memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).
> 
> But when "-numa" option is provided without "mem" parameter,
> the memory is equally divided between nodes, but 8MB aligned.
> This can be not valid for pseries.
> 
> In that case we can have:
> $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
> qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
> 
> With this patch, we have:
> (qemu) info numa
> 3 nodes
> node 0 cpus: 0
> node 0 size: 1280 MB
> node 1 cpus:
> node 1 size: 1280 MB
> node 2 cpus:
> node 2 size: 1536 MB
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
>  dtc    |  2 +-
>  numa.c | 14 +++++++++-----
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/dtc b/dtc
> index 558cd81..fa8bc7f 160000
> --- a/dtc
> +++ b/dtc
> @@ -1 +1 @@
> -Subproject commit 558cd81bdd432769b59bff01240c44f82cfb1a9d
> +Subproject commit fa8bc7f928ac25f23532afc8beb2073efc8fb063

Accidential change?

> diff --git a/numa.c b/numa.c
> index e01cb54..a911284 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -337,15 +337,19 @@ void parse_numa_opts(MachineClass *mc)
>          }
>          if (i == nb_numa_nodes) {
>              uint64_t usedmem = 0;
> -
> -            /* On Linux, each node's border has to be 8MB aligned,
> -             * the final node gets the rest.
> -             */
> +#if defined(TARGET_PPC64)
> +            /* pseries requests each node's border has to be 256 MB aligned */
> +            const uint64_t numa_mem_align_mask = ~((1 << 28UL) - 1);
> +#else
> +            /* On Linux, each node's border has to be 8MB aligned */
> +            const uint64_t numa_mem_align_mask = ~((1 << 23UL)  - 1);
> +#endif

With that #if, you enable this for all ppc machines, not just for the
pseries machine. So it might be cleaner to add a setting to the
MachineClass instead...?

>              for (i = 0; i < nb_numa_nodes - 1; i++) {
>                  numa_info[i].node_mem = (ram_size / nb_numa_nodes) &
> -                                        ~((1 << 23UL) - 1);
> +                                        numa_mem_align_mask;
>                  usedmem += numa_info[i].node_mem;
>              }
> +            /* the final node gets the rest. */
>              numa_info[i].node_mem = ram_size - usedmem;
>          }

 Thomas


Re: [Qemu-devel] [PATCH] numa, spapr: align default numa node memory size to 256MB
Posted by Laurent Vivier 7 years, 1 month ago
On 20/03/2017 12:31, Thomas Huth wrote:
> On 20.03.2017 12:24, Laurent Vivier wrote:
>> Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
>> memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).
>>
>> But when "-numa" option is provided without "mem" parameter,
>> the memory is equally divided between nodes, but 8MB aligned.
>> This can be not valid for pseries.
>>
>> In that case we can have:
>> $ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
>> qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB
>>
>> With this patch, we have:
>> (qemu) info numa
>> 3 nodes
>> node 0 cpus: 0
>> node 0 size: 1280 MB
>> node 1 cpus:
>> node 1 size: 1280 MB
>> node 2 cpus:
>> node 2 size: 1536 MB
>>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>>  dtc    |  2 +-
>>  numa.c | 14 +++++++++-----
>>  2 files changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/dtc b/dtc
>> index 558cd81..fa8bc7f 160000
>> --- a/dtc
>> +++ b/dtc
>> @@ -1 +1 @@
>> -Subproject commit 558cd81bdd432769b59bff01240c44f82cfb1a9d
>> +Subproject commit fa8bc7f928ac25f23532afc8beb2073efc8fb063
> 
> Accidential change?
> 
>> diff --git a/numa.c b/numa.c
>> index e01cb54..a911284 100644
>> --- a/numa.c
>> +++ b/numa.c
>> @@ -337,15 +337,19 @@ void parse_numa_opts(MachineClass *mc)
>>          }
>>          if (i == nb_numa_nodes) {
>>              uint64_t usedmem = 0;
>> -
>> -            /* On Linux, each node's border has to be 8MB aligned,
>> -             * the final node gets the rest.
>> -             */
>> +#if defined(TARGET_PPC64)
>> +            /* pseries requests each node's border has to be 256 MB aligned */
>> +            const uint64_t numa_mem_align_mask = ~((1 << 28UL) - 1);
>> +#else
>> +            /* On Linux, each node's border has to be 8MB aligned */
>> +            const uint64_t numa_mem_align_mask = ~((1 << 23UL)  - 1);
>> +#endif
> 
> With that #if, you enable this for all ppc machines, not just for the
> pseries machine. So it might be cleaner to add a setting to the
> MachineClass instead...?
> 

I will rework that. But I thought the only PPC64 machine able to boot
with "-numa"  is pseries.

Thanks,
Laurent