[XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init

Wei Chen posted 40 patches 4 years, 5 months ago
[XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init
Posted by Wei Chen 4 years, 5 months ago
Everything is ready, we can remove the fake NUMA node and
depends on device tree to create NUMA system.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/numa.c        | 45 ++++++++++++++++++++++----------------
 xen/include/asm-arm/numa.h |  7 ------
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
index 2a18c97470..3b04220e60 100644
--- a/xen/arch/arm/numa.c
+++ b/xen/arch/arm/numa.c
@@ -18,6 +18,7 @@
  *
  */
 #include <xen/init.h>
+#include <xen/device_tree.h>
 #include <xen/nodemask.h>
 #include <xen/numa.h>
 #include <xen/pfn.h>
@@ -83,28 +84,34 @@ void __init numa_init(bool acpi_off)
     paddr_t ram_size = 0;
     paddr_t ram_end = 0;
 
-    printk(XENLOG_WARNING
-        "NUMA has not been supported yet, NUMA off!\n");
-    /* Arm NUMA has not been implemented until this patch */
-    numa_off = true;
+    /* NUMA has been turned off through Xen parameters */
+    if ( numa_off )
+        goto mem_init;
 
-    /*
-     * Set all cpu_to_node mapping to 0, this will make cpu_to_node
-     * function return 0 as previous fake cpu_to_node API.
-     */
-    for ( idx = 0; idx < NR_CPUS; idx++ )
-        cpu_to_node[idx] = 0;
-
-    /*
-     * Make node_to_cpumask, node_spanned_pages and node_start_pfn
-     * return as previous fake APIs.
-     */
-    for ( idx = 0; idx < MAX_NUMNODES; idx++ ) {
-        node_to_cpumask[idx] = cpu_online_map;
-        node_spanned_pages(idx) = (max_page - mfn_x(first_valid_mfn));
-        node_start_pfn(idx) = (mfn_x(first_valid_mfn));
+    /* Initialize NUMA from device tree when system is not ACPI booted */
+    if ( acpi_off )
+    {
+#ifdef CONFIG_DEVICE_TREE_NUMA
+        int ret = numa_device_tree_init(device_tree_flattened);
+        if ( !ret )
+            goto mem_init;
+        printk(XENLOG_WARNING
+               "Init NUMA from device tree failed, ret=%d\n", ret);
+#else
+        printk(XENLOG_WARNING
+               "CONFIG_DEVICE_TREE_NUMA is not set, NUMA off!\n");
+#endif
+        numa_off = true;
+    }
+    else
+    {
+        /* We don't support NUMA for ACPI boot currently */
+        printk(XENLOG_WARNING
+               "ACPI NUMA has not been supported yet, NUMA off!\n");
+        numa_off = true;
     }
 
+mem_init:
     /*
      * Find the minimal and maximum address of RAM, NUMA will
      * build a memory to node mapping table for the whole range.
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index a3982a94b6..425eb9aede 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -30,13 +30,6 @@ extern int numa_device_tree_init(const void *fdt);
 extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
 extern void arch_numa_init_failed_fallback(void);
 
-/*
- * Temporary for fake NUMA node, when CPU, memory and distance
- * matrix will be read from DTB or ACPI SRAT. The following
- * symbols will be removed.
- */
-extern mfn_t first_valid_mfn;
-
 #else
 
 /* Fake one node for now. See also node_online_map. */
-- 
2.25.1


Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init
Posted by Julien Grall 4 years, 5 months ago
Hi Wei,

On 11/08/2021 11:24, Wei Chen wrote:
> Everything is ready, we can remove the fake NUMA node and
> depends on device tree to create NUMA system.

So you just added code a few patches before that are now completely 
rewritten. Can you please re-order this series so it doesn't happen?

This may mean that CONFIG_NUMA is only selected until late in this series.

Cheers,

-- 
Julien Grall

RE: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init
Posted by Wei Chen 4 years, 5 months ago
Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 2021年8月27日 22:33
> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> sstabellini@kernel.org; jbeulich@suse.com
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA
> in system init
> 
> Hi Wei,
> 
> On 11/08/2021 11:24, Wei Chen wrote:
> > Everything is ready, we can remove the fake NUMA node and
> > depends on device tree to create NUMA system.
> 
> So you just added code a few patches before that are now completely
> rewritten. Can you please re-order this series so it doesn't happen?
> 
> This may mean that CONFIG_NUMA is only selected until late in this series.
> 

Why I did like this is because my original concerns are:
1. When I introduced the CONFIG_NUMA. Users will be using a code base on
   this commit by accident.
2. If users select CONFIG_NUMA, but not all NUMA data are not initialized
   properly. The system may not work properly.
3. So I created the fake node to initialize the NUMA data, before we parser
   real data from DTB.
4. In this case, user can work well with CONFIG_NUMA is enabled, without
   this series is completed.

It seems I thought too much. If these concerns are not necessary. I am
OK to re-order this series.

> Cheers,
> 
> --
> Julien Grall
Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init
Posted by Julien Grall 4 years, 5 months ago

On 28/08/2021 04:17, Wei Chen wrote:
> Hi Julien,

Hi Wei,

> 
>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 2021年8月27日 22:33
>> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
>> sstabellini@kernel.org; jbeulich@suse.com
>> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>
>> Subject: Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA
>> in system init
>>
>> Hi Wei,
>>
>> On 11/08/2021 11:24, Wei Chen wrote:
>>> Everything is ready, we can remove the fake NUMA node and
>>> depends on device tree to create NUMA system.
>>
>> So you just added code a few patches before that are now completely
>> rewritten. Can you please re-order this series so it doesn't happen?
>>
>> This may mean that CONFIG_NUMA is only selected until late in this series.
>>
> 
> Why I did like this is because my original concerns are:
> 1. When I introduced the CONFIG_NUMA. Users will be using a code base on
>     this commit by accident.
> 2. If users select CONFIG_NUMA, but not all NUMA data are not initialized
>     properly. The system may not work properly.

We have to make sure we don't break any existing use case when writing a 
new feature. However, a user should not expect a new feature to work it 
is using a random commit in the middle of the series.

This is also why I suggested that maybe CONFIG_NUMA is only selected for 
Arm towards the end of the series. So you reduce the risk of someone 
selecting it.

> 3. So I created the fake node to initialize the NUMA data, before we parser
>     real data from DTB.
> 4. In this case, user can work well with CONFIG_NUMA is enabled, without
>     this series is completed.

The flip side is you are adding more load on the reviewers because there 
are extra code. The series is already quite big (40 patches), any way to 
ease the review will definitely be appreciated.

Another possible way to ease the review is to move the patch that 
rework/move code at the beginning of the series and leave the Arm part 
for the second part of the series. This way, we can start to merge the 
series without waiting for the Arm bits to be completed.

Cheers,

-- 
Julien Grall

RE: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init
Posted by Wei Chen 4 years, 5 months ago
Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 2021年8月28日 18:45
> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> sstabellini@kernel.org
> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA
> in system init
> 
> 
> 
> On 28/08/2021 04:17, Wei Chen wrote:
> > Hi Julien,
> 
> Hi Wei,
> 
> >
> >> -----Original Message-----
> >> From: Julien Grall <julien@xen.org>
> >> Sent: 2021年8月27日 22:33
> >> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org;
> >> sstabellini@kernel.org; jbeulich@suse.com
> >> Cc: Bertrand Marquis <Bertrand.Marquis@arm.com>
> >> Subject: Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based
> NUMA
> >> in system init
> >>
> >> Hi Wei,
> >>
> >> On 11/08/2021 11:24, Wei Chen wrote:
> >>> Everything is ready, we can remove the fake NUMA node and
> >>> depends on device tree to create NUMA system.
> >>
> >> So you just added code a few patches before that are now completely
> >> rewritten. Can you please re-order this series so it doesn't happen?
> >>
> >> This may mean that CONFIG_NUMA is only selected until late in this
> series.
> >>
> >
> > Why I did like this is because my original concerns are:
> > 1. When I introduced the CONFIG_NUMA. Users will be using a code base on
> >     this commit by accident.
> > 2. If users select CONFIG_NUMA, but not all NUMA data are not
> initialized
> >     properly. The system may not work properly.
> 
> We have to make sure we don't break any existing use case when writing a
> new feature. However, a user should not expect a new feature to work it
> is using a random commit in the middle of the series.
> 
> This is also why I suggested that maybe CONFIG_NUMA is only selected for
> Arm towards the end of the series. So you reduce the risk of someone
> selecting it.
> 

Thanks for this clarification.

> > 3. So I created the fake node to initialize the NUMA data, before we
> parser
> >     real data from DTB.
> > 4. In this case, user can work well with CONFIG_NUMA is enabled, without
> >     this series is completed.
> 
> The flip side is you are adding more load on the reviewers because there
> are extra code. The series is already quite big (40 patches), any way to
> ease the review will definitely be appreciated.
> 
> Another possible way to ease the review is to move the patch that
> rework/move code at the beginning of the series and leave the Arm part
> for the second part of the series. This way, we can start to merge the
> series without waiting for the Arm bits to be completed.
> 

Yes, I will try to re-order the patches in this way in next version.

> Cheers,
> 
> --
> Julien Grall
Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init
Posted by Stefano Stabellini 4 years, 5 months ago
On Wed, 11 Aug 2021, Wei Chen wrote:
> Everything is ready, we can remove the fake NUMA node and
> depends on device tree to create NUMA system.
> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>  xen/arch/arm/numa.c        | 45 ++++++++++++++++++++++----------------
>  xen/include/asm-arm/numa.h |  7 ------
>  2 files changed, 26 insertions(+), 26 deletions(-)
> 
> diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> index 2a18c97470..3b04220e60 100644
> --- a/xen/arch/arm/numa.c
> +++ b/xen/arch/arm/numa.c
> @@ -18,6 +18,7 @@
>   *
>   */
>  #include <xen/init.h>
> +#include <xen/device_tree.h>
>  #include <xen/nodemask.h>
>  #include <xen/numa.h>
>  #include <xen/pfn.h>
> @@ -83,28 +84,34 @@ void __init numa_init(bool acpi_off)
>      paddr_t ram_size = 0;
>      paddr_t ram_end = 0;
>  
> -    printk(XENLOG_WARNING
> -        "NUMA has not been supported yet, NUMA off!\n");
> -    /* Arm NUMA has not been implemented until this patch */
> -    numa_off = true;
> +    /* NUMA has been turned off through Xen parameters */
> +    if ( numa_off )
> +        goto mem_init;
>  
> -    /*
> -     * Set all cpu_to_node mapping to 0, this will make cpu_to_node
> -     * function return 0 as previous fake cpu_to_node API.
> -     */
> -    for ( idx = 0; idx < NR_CPUS; idx++ )
> -        cpu_to_node[idx] = 0;
> -
> -    /*
> -     * Make node_to_cpumask, node_spanned_pages and node_start_pfn
> -     * return as previous fake APIs.
> -     */
> -    for ( idx = 0; idx < MAX_NUMNODES; idx++ ) {
> -        node_to_cpumask[idx] = cpu_online_map;
> -        node_spanned_pages(idx) = (max_page - mfn_x(first_valid_mfn));
> -        node_start_pfn(idx) = (mfn_x(first_valid_mfn));
> +    /* Initialize NUMA from device tree when system is not ACPI booted */
> +    if ( acpi_off )
> +    {
> +#ifdef CONFIG_DEVICE_TREE_NUMA
> +        int ret = numa_device_tree_init(device_tree_flattened);
> +        if ( !ret )
> +            goto mem_init;
> +        printk(XENLOG_WARNING
> +               "Init NUMA from device tree failed, ret=%d\n", ret);
> +#else
> +        printk(XENLOG_WARNING
> +               "CONFIG_DEVICE_TREE_NUMA is not set, NUMA off!\n");

I don't think we want to see this warning every time at boot when
CONFIG_DEVICE_TREE_NUMA is off. I'd set it to XENLOG_DEBUG or remove it.

Also given that we have many stub functions in
xen/include/asm-arm/numa.h already, maybe we could also have a stub
function for numa_device_tree_init so that we won'd need an #ifdef
CONFIG_DEVICE_TREE_NUMA here.


> +#endif
> +        numa_off = true;
> +    }
> +    else
> +    {
> +        /* We don't support NUMA for ACPI boot currently */
> +        printk(XENLOG_WARNING
> +               "ACPI NUMA has not been supported yet, NUMA off!\n");
> +        numa_off = true;
>      }
>  
> +mem_init:
>      /*
>       * Find the minimal and maximum address of RAM, NUMA will
>       * build a memory to node mapping table for the whole range.
> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> index a3982a94b6..425eb9aede 100644
> --- a/xen/include/asm-arm/numa.h
> +++ b/xen/include/asm-arm/numa.h
> @@ -30,13 +30,6 @@ extern int numa_device_tree_init(const void *fdt);
>  extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t distance);
>  extern void arch_numa_init_failed_fallback(void);
>  
> -/*
> - * Temporary for fake NUMA node, when CPU, memory and distance
> - * matrix will be read from DTB or ACPI SRAT. The following
> - * symbols will be removed.
> - */
> -extern mfn_t first_valid_mfn;
> -
>  #else
>  
>  /* Fake one node for now. See also node_online_map. */
> -- 
> 2.25.1
> 

RE: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA in system init
Posted by Wei Chen 4 years, 5 months ago
Hi Stefano,

> -----Original Message-----
> From: Stefano Stabellini <sstabellini@kernel.org>
> Sent: 2021年8月31日 9:51
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: xen-devel@lists.xenproject.org; sstabellini@kernel.org; julien@xen.org;
> jbeulich@suse.com; Bertrand Marquis <Bertrand.Marquis@arm.com>
> Subject: Re: [XEN RFC PATCH 38/40] xen/arm: enable device tree based NUMA
> in system init
> 
> On Wed, 11 Aug 2021, Wei Chen wrote:
> > Everything is ready, we can remove the fake NUMA node and
> > depends on device tree to create NUMA system.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >  xen/arch/arm/numa.c        | 45 ++++++++++++++++++++++----------------
> >  xen/include/asm-arm/numa.h |  7 ------
> >  2 files changed, 26 insertions(+), 26 deletions(-)
> >
> > diff --git a/xen/arch/arm/numa.c b/xen/arch/arm/numa.c
> > index 2a18c97470..3b04220e60 100644
> > --- a/xen/arch/arm/numa.c
> > +++ b/xen/arch/arm/numa.c
> > @@ -18,6 +18,7 @@
> >   *
> >   */
> >  #include <xen/init.h>
> > +#include <xen/device_tree.h>
> >  #include <xen/nodemask.h>
> >  #include <xen/numa.h>
> >  #include <xen/pfn.h>
> > @@ -83,28 +84,34 @@ void __init numa_init(bool acpi_off)
> >      paddr_t ram_size = 0;
> >      paddr_t ram_end = 0;
> >
> > -    printk(XENLOG_WARNING
> > -        "NUMA has not been supported yet, NUMA off!\n");
> > -    /* Arm NUMA has not been implemented until this patch */
> > -    numa_off = true;
> > +    /* NUMA has been turned off through Xen parameters */
> > +    if ( numa_off )
> > +        goto mem_init;
> >
> > -    /*
> > -     * Set all cpu_to_node mapping to 0, this will make cpu_to_node
> > -     * function return 0 as previous fake cpu_to_node API.
> > -     */
> > -    for ( idx = 0; idx < NR_CPUS; idx++ )
> > -        cpu_to_node[idx] = 0;
> > -
> > -    /*
> > -     * Make node_to_cpumask, node_spanned_pages and node_start_pfn
> > -     * return as previous fake APIs.
> > -     */
> > -    for ( idx = 0; idx < MAX_NUMNODES; idx++ ) {
> > -        node_to_cpumask[idx] = cpu_online_map;
> > -        node_spanned_pages(idx) = (max_page - mfn_x(first_valid_mfn));
> > -        node_start_pfn(idx) = (mfn_x(first_valid_mfn));
> > +    /* Initialize NUMA from device tree when system is not ACPI booted
> */
> > +    if ( acpi_off )
> > +    {
> > +#ifdef CONFIG_DEVICE_TREE_NUMA
> > +        int ret = numa_device_tree_init(device_tree_flattened);
> > +        if ( !ret )
> > +            goto mem_init;
> > +        printk(XENLOG_WARNING
> > +               "Init NUMA from device tree failed, ret=%d\n", ret);
> > +#else
> > +        printk(XENLOG_WARNING
> > +               "CONFIG_DEVICE_TREE_NUMA is not set, NUMA off!\n");
> 
> I don't think we want to see this warning every time at boot when
> CONFIG_DEVICE_TREE_NUMA is off. I'd set it to XENLOG_DEBUG or remove it.
> 

OK

> Also given that we have many stub functions in
> xen/include/asm-arm/numa.h already, maybe we could also have a stub
> function for numa_device_tree_init so that we won'd need an #ifdef
> CONFIG_DEVICE_TREE_NUMA here.
> 

Yes, it's a good idea. I will do it.

> 
> > +#endif
> > +        numa_off = true;
> > +    }
> > +    else
> > +    {
> > +        /* We don't support NUMA for ACPI boot currently */
> > +        printk(XENLOG_WARNING
> > +               "ACPI NUMA has not been supported yet, NUMA off!\n");
> > +        numa_off = true;
> >      }
> >
> > +mem_init:
> >      /*
> >       * Find the minimal and maximum address of RAM, NUMA will
> >       * build a memory to node mapping table for the whole range.
> > diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
> > index a3982a94b6..425eb9aede 100644
> > --- a/xen/include/asm-arm/numa.h
> > +++ b/xen/include/asm-arm/numa.h
> > @@ -30,13 +30,6 @@ extern int numa_device_tree_init(const void *fdt);
> >  extern void numa_set_distance(nodeid_t from, nodeid_t to, uint32_t
> distance);
> >  extern void arch_numa_init_failed_fallback(void);
> >
> > -/*
> > - * Temporary for fake NUMA node, when CPU, memory and distance
> > - * matrix will be read from DTB or ACPI SRAT. The following
> > - * symbols will be removed.
> > - */
> > -extern mfn_t first_valid_mfn;
> > -
> >  #else
> >
> >  /* Fake one node for now. See also node_online_map. */
> > --
> > 2.25.1
> >