The sanity check performed by the nodes_cover_memory function is
also necessary for other architectures that do not yet support NUMA.
But now, the code of nodes_cover_memory is tied to the x86 E820.
In this case, we introduce arch_get_memory_map to decouple
architecture specific memory map from this function. This means,
other architectures like Arm can also use it to check its bootmem
info.
Depending on arch_get_memory_map, we make nodes_cover_memory become
architecture independent. We also use neutral words to replace
SRAT and E820 in the print message of this function. This will
make the message more common.
As arch_get_memory_map use unsigned int for index, we also adjust
the index in nodes_cover_memory from int to unsigned int.
Signed-off-by: Wei Chen <wei.chen@arm.com>
---
v1 -> v2:
1. Use arch_get_memory_map to replace arch_get_memory_bank_range
and arch_get_memory_bank_number.
2. Remove the !start || !end check, because caller guarantee
these two pointers will not be NULL.
---
xen/arch/x86/numa.c | 23 +++++++++++++++++++++++
xen/arch/x86/srat.c | 18 +++++++++++-------
xen/include/xen/numa.h | 3 +++
3 files changed, 37 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 193314dbd9..fb235c07ec 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -9,6 +9,7 @@
#include <xen/nodemask.h>
#include <xen/numa.h>
#include <asm/acpi.h>
+#include <asm/e820.h>
#ifndef Dprintk
#define Dprintk(x...)
@@ -82,3 +83,25 @@ unsigned int __init arch_get_dma_bitsize(void)
flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1)
+ PAGE_SHIFT, 32);
}
+
+/*
+ * This function provides the ability for caller to get one RAM entry
+ * from architectural memory map by index.
+ *
+ * This function will return zero if it can return a proper RAM entry.
+ * otherwise it will return -ENODEV for out of scope index, or return
+ * -EINVAL for non-RAM type memory entry.
+ */
+int __init arch_get_memory_map(unsigned int idx, paddr_t *start, paddr_t *end)
+{
+ if ( idx >= e820.nr_map )
+ return -ENODEV;
+
+ if ( e820.map[idx].type != E820_RAM )
+ return -EINVAL;
+
+ *start = e820.map[idx].addr;
+ *end = e820.map[idx].addr + e820.map[idx].size;
+
+ return 0;
+}
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 422e4c73e3..5bc9096a15 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -427,18 +427,22 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
Make sure the PXMs cover all memory. */
static int __init nodes_cover_memory(void)
{
- int i;
+ unsigned int i;
- for (i = 0; i < e820.nr_map; i++) {
+ for (i = 0; ; i++) {
int j, found;
paddr_t start, end;
- if (e820.map[i].type != E820_RAM) {
+ /* Try to loop memory map from index 0 to end. */
+ found = arch_get_memory_map(i, &start, &end);
+
+ /* Index relate entry is not RAM, skip it. */
+ if (found == -EINVAL)
continue;
- }
- start = e820.map[i].addr;
- end = e820.map[i].addr + e820.map[i].size;
+ /* Reach the end of arch's memory map */
+ if (found == -ENODEV)
+ break;
do {
found = 0;
@@ -457,7 +461,7 @@ static int __init nodes_cover_memory(void)
} while (found && start < end);
if (start < end) {
- printk(KERN_ERR "SRAT: No PXM for e820 range: "
+ printk(KERN_ERR "NUMA: No node for memory map range: "
"[%"PRIpaddr", %"PRIpaddr"]\n", start, end - 1);
return 0;
}
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 695ad51df0..6d02204991 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -88,6 +88,9 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
#define node_end_pfn(nid) (NODE_DATA(nid)->node_start_pfn + \
NODE_DATA(nid)->node_spanned_pages)
+extern int arch_get_memory_map(unsigned int idx,
+ paddr_t *start, paddr_t *end);
+
#endif
#endif /* _XEN_NUMA_H */
--
2.25.1
On 08.07.2022 16:54, Wei Chen wrote:
> @@ -82,3 +83,25 @@ unsigned int __init arch_get_dma_bitsize(void)
> flsl(node_start_pfn(node) + node_spanned_pages(node) / 4 - 1)
> + PAGE_SHIFT, 32);
> }
> +
> +/*
> + * This function provides the ability for caller to get one RAM entry
> + * from architectural memory map by index.
> + *
> + * This function will return zero if it can return a proper RAM entry.
> + * otherwise it will return -ENODEV for out of scope index, or return
> + * -EINVAL for non-RAM type memory entry.
> + */
I think the comment also wants to spell out that the range is
exclusive at the end (assuming that's suitable for Arm; else now
would perhaps be the time to change it).
> +int __init arch_get_memory_map(unsigned int idx, paddr_t *start, paddr_t *end)
> +{
> + if ( idx >= e820.nr_map )
> + return -ENODEV;
Perhaps better -ENOENT?
> + if ( e820.map[idx].type != E820_RAM )
> + return -EINVAL;
I'm sorry, this should have occurred to me already when commenting on
v1: "get_memory_map" doesn't really fit this "RAM only" restriction.
Maybe arch_get_ram_range()? Or maybe others have some good naming
suggestion?
> + *start = e820.map[idx].addr;
> + *end = e820.map[idx].addr + e820.map[idx].size;
Nit: Would be shorter to read if you (re)used *start.
Jan
Hi Jan,
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022年7月12日 21:40
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Andrew Cooper <andrew.cooper3@citrix.com>; Roger Pau
> Monné <roger.pau@citrix.com>; Wei Liu <wl@xen.org>; George Dunlap
> <george.dunlap@citrix.com>; Julien Grall <julien@xen.org>; Stefano
> Stabellini <sstabellini@kernel.org>; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v2 5/9] xen/x86: use arch_get_memory_map to get
> information from E820 map
>
> On 08.07.2022 16:54, Wei Chen wrote:
> > @@ -82,3 +83,25 @@ unsigned int __init arch_get_dma_bitsize(void)
> > flsl(node_start_pfn(node) + node_spanned_pages(node) /
> 4 - 1)
> > + PAGE_SHIFT, 32);
> > }
> > +
> > +/*
> > + * This function provides the ability for caller to get one RAM entry
> > + * from architectural memory map by index.
> > + *
> > + * This function will return zero if it can return a proper RAM entry.
> > + * otherwise it will return -ENODEV for out of scope index, or return
> > + * -EINVAL for non-RAM type memory entry.
> > + */
>
> I think the comment also wants to spell out that the range is
> exclusive at the end (assuming that's suitable for Arm; else now
> would perhaps be the time to change it).
>
Did you mean we have to mention the range is [star, end)?
If yes, I will add related comment. And...
> > +int __init arch_get_memory_map(unsigned int idx, paddr_t *start,
> paddr_t *end)
> > +{
> > + if ( idx >= e820.nr_map )
> > + return -ENODEV;
>
> Perhaps better -ENOENT?
>
Ack.
> > + if ( e820.map[idx].type != E820_RAM )
> > + return -EINVAL;
>
> I'm sorry, this should have occurred to me already when commenting on
> v1: "get_memory_map" doesn't really fit this "RAM only" restriction.
> Maybe arch_get_ram_range()? Or maybe others have some good naming
> suggestion?
arch_get_ram_range makes sense to me. I will update in next version.
>
> > + *start = e820.map[idx].addr;
> > + *end = e820.map[idx].addr + e820.map[idx].size;
>
...In this case, I think we don't need to adjust this line to:
*end = e820.map[idx].addr + e820.map[idx].size - 1;
As we have mentioned the range is [start, end).
> Nit: Would be shorter to read if you (re)used *start.
>
Ack.
> Jan
On 13.07.2022 12:37, Wei Chen wrote: >> From: Jan Beulich <jbeulich@suse.com> >> Sent: 2022年7月12日 21:40 >> >> On 08.07.2022 16:54, Wei Chen wrote: >>> @@ -82,3 +83,25 @@ unsigned int __init arch_get_dma_bitsize(void) >>> flsl(node_start_pfn(node) + node_spanned_pages(node) / >> 4 - 1) >>> + PAGE_SHIFT, 32); >>> } >>> + >>> +/* >>> + * This function provides the ability for caller to get one RAM entry >>> + * from architectural memory map by index. >>> + * >>> + * This function will return zero if it can return a proper RAM entry. >>> + * otherwise it will return -ENODEV for out of scope index, or return >>> + * -EINVAL for non-RAM type memory entry. >>> + */ >> >> I think the comment also wants to spell out that the range is >> exclusive at the end (assuming that's suitable for Arm; else now >> would perhaps be the time to change it). >> > > Did you mean we have to mention the range is [star, end)? > If yes, I will add related comment. Yes (worded one way or another). Jan
© 2016 - 2026 Red Hat, Inc.