VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
results in two lines of error-checking code in phys_to_nid
that is not actually working and causing two compilation
errors:
1. error: "MAX_NUMNODES" undeclared (first use in this function).
This is because in the common header file, "MAX_NUMNODES" is
defined after the common header file includes the ARCH header
file, where phys_to_nid has attempted to use "MAX_NUMNODES".
This error was resolved after we moved the phys_to_nid from
x86 ARCH header file to common header file.
2. error: wrong type argument to unary exclamation mark.
This is because, the error-checking code contains !node_data[nid].
But node_data is a data structure variable, it's not a pointer.
So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
enable the two lines of error-checking code. And fix the left
compilation errors by replacing !node_data[nid] to
!node_data[nid].node_spanned_pages. Although NUMA allows one node
can only have CPUs but without any memory. And node with 0 bytes
of memory might have an entry in memnodemap[] theoretically. But
that doesn't mean phys_to_nid can find any valid address from a
node with 0 bytes memory.
Signed-off-by: Wei Chen <wei.chen@arm.com>
Tested-by: Jiamei Xie <jiamei.xie@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
v8 -> v9:
1. No change.
v7 -> v8:
1. No change.
v6 -> v7:
1. No change.
v5 -> v6:
1. No change.
v4 -> v5:
1. No change.
v3 -> v4:
1. No change.
v2 -> v3:
1. Remove unnecessary change items in history.
2. Add Acked-by.
v1 -> v2:
1. Use ASSERT to replace VIRTUAL_BUG_ON in phys_to_nid.
2. Adjust the conditional express for ASSERT.
3. Refine the justification of using !node_data[nid].node_spanned_pages.
---
xen/include/xen/numa.h | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 5b3877344b..04556f3a6f 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -35,8 +35,6 @@ struct node {
extern int compute_hash_shift(const struct node *nodes,
unsigned int numnodes, const nodeid_t *nodeids);
-#define VIRTUAL_BUG_ON(x)
-
extern bool numa_off;
extern void numa_add_cpu(unsigned int cpu);
@@ -69,9 +67,9 @@ extern struct node_data node_data[];
static inline nodeid_t __attribute_pure__ phys_to_nid(paddr_t addr)
{
nodeid_t nid;
- VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
+ ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
- VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
+ ASSERT(nid < MAX_NUMNODES && node_data[nid].node_spanned_pages);
return nid;
}
--
2.25.1
On 18.11.2022 11:45, Wei Chen wrote:
> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
> results in two lines of error-checking code in phys_to_nid
> that is not actually working and causing two compilation
> errors:
> 1. error: "MAX_NUMNODES" undeclared (first use in this function).
> This is because in the common header file, "MAX_NUMNODES" is
> defined after the common header file includes the ARCH header
> file, where phys_to_nid has attempted to use "MAX_NUMNODES".
> This error was resolved after we moved the phys_to_nid from
> x86 ARCH header file to common header file.
> 2. error: wrong type argument to unary exclamation mark.
> This is because, the error-checking code contains !node_data[nid].
> But node_data is a data structure variable, it's not a pointer.
>
> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
> enable the two lines of error-checking code. And fix the left
> compilation errors by replacing !node_data[nid] to
> !node_data[nid].node_spanned_pages. Although NUMA allows one node
> can only have CPUs but without any memory. And node with 0 bytes
> of memory might have an entry in memnodemap[] theoretically. But
> that doesn't mean phys_to_nid can find any valid address from a
> node with 0 bytes memory.
>
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> Tested-by: Jiamei Xie <jiamei.xie@arm.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
This patch is what is causing the regression found by osstest. The
previously silent (dead) checks no trigger when paging_init()
encounters a hole in SRAT-described space, as is the case e.g. on
the himrods:
(XEN) NUMA: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
(XEN) NUMA: Node 0 PXM 0 [0000000100000000, 000000087fffffff]
(XEN) NUMA: Node 1 PXM 1 [0000000880000000, 000000107fffffff]
(XEN) NUMA: Using 19 for the hash shift
The node ID for 0x80000000 (slot 1 of memnodemap[]) is NUMA_NO_NODE,
which of course fails the left side of ...
> @@ -69,9 +67,9 @@ extern struct node_data node_data[];
> static inline nodeid_t __attribute_pure__ phys_to_nid(paddr_t addr)
> {
> nodeid_t nid;
> - VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >= memnodemapsize);
> + ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
> nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
> - VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
> + ASSERT(nid < MAX_NUMNODES && node_data[nid].node_spanned_pages);
... the && here. As I don't think the use of phys_to_nid() by
paging_init() is outright invalid, I would conclude that the
condition needs to be relaxed to permit for NUMA_NO_NODE. Otoh it's
possible that the function was really intended to never return
NUMA_NO_NODE (and only ever be used on valid memory ranges), in
which case paging_init() would need to use something else (or
make the call conditional): According to my audit all uses except
the two in paging_init() are on (supposedly) valid addresses only
(commonly when already holding in hands a valid struct page_info).
Then again us having phys_to_nid() in the first place is somewhat
bogus: No caller really starts from a physical address. It would
be quite a bit more sensible to have page_to_nid() and mfn_to_nid(),
the more that what phys_to_nid() does is pass the address to
paddr_to_pdx() (undoing every caller's to-addr conversion). At which
point the former could do strict checking (disallowing NUMA_NO_NODE
output) while the latter could be more relaxed. I guess I'll make a
patch along these lines.
Jan
Hi Jan,
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 2022年12月13日 17:47
> To: Wei Chen <Wei.Chen@arm.com>
> Cc: nd <nd@arm.com>; Andrew Cooper <andrew.cooper3@citrix.com>; George
> Dunlap <george.dunlap@citrix.com>; Julien Grall <julien@xen.org>; Stefano
> Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Jiamei Xie
> <Jiamei.Xie@arm.com>; xen-devel@lists.xenproject.org; Roger Pau Monné
> <roger.pau@citrix.com>
> Subject: Re: [PATCH v9 3/6] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON
> for phys_to_nid
>
> On 18.11.2022 11:45, Wei Chen wrote:
> > VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This
> > results in two lines of error-checking code in phys_to_nid
> > that is not actually working and causing two compilation
> > errors:
> > 1. error: "MAX_NUMNODES" undeclared (first use in this function).
> > This is because in the common header file, "MAX_NUMNODES" is
> > defined after the common header file includes the ARCH header
> > file, where phys_to_nid has attempted to use "MAX_NUMNODES".
> > This error was resolved after we moved the phys_to_nid from
> > x86 ARCH header file to common header file.
> > 2. error: wrong type argument to unary exclamation mark.
> > This is because, the error-checking code contains !node_data[nid].
> > But node_data is a data structure variable, it's not a pointer.
> >
> > So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to
> > enable the two lines of error-checking code. And fix the left
> > compilation errors by replacing !node_data[nid] to
> > !node_data[nid].node_spanned_pages. Although NUMA allows one node
> > can only have CPUs but without any memory. And node with 0 bytes
> > of memory might have an entry in memnodemap[] theoretically. But
> > that doesn't mean phys_to_nid can find any valid address from a
> > node with 0 bytes memory.
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > Tested-by: Jiamei Xie <jiamei.xie@arm.com>
> > Acked-by: Jan Beulich <jbeulich@suse.com>
>
> This patch is what is causing the regression found by osstest. The
> previously silent (dead) checks no trigger when paging_init()
> encounters a hole in SRAT-described space, as is the case e.g. on
> the himrods:
>
> (XEN) NUMA: Node 0 PXM 0 [0000000000000000, 000000007fffffff]
> (XEN) NUMA: Node 0 PXM 0 [0000000100000000, 000000087fffffff]
> (XEN) NUMA: Node 1 PXM 1 [0000000880000000, 000000107fffffff]
> (XEN) NUMA: Using 19 for the hash shift
>
> The node ID for 0x80000000 (slot 1 of memnodemap[]) is NUMA_NO_NODE,
> which of course fails the left side of ...
>
I am sorry, I had not triggered this in my testing machine. I'm a bit
curious, on NUMA platforms will there be memory that doesn't belong to
any node? If not, why the space of 0x80000000 will be used in paging_init?
> > @@ -69,9 +67,9 @@ extern struct node_data node_data[];
> > static inline nodeid_t __attribute_pure__ phys_to_nid(paddr_t addr)
> > {
> > nodeid_t nid;
> > - VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >=
> memnodemapsize);
> > + ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
> > nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
> > - VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
> > + ASSERT(nid < MAX_NUMNODES && node_data[nid].node_spanned_pages);
>
> ... the && here. As I don't think the use of phys_to_nid() by
> paging_init() is outright invalid, I would conclude that the
> condition needs to be relaxed to permit for NUMA_NO_NODE. Otoh it's
> possible that the function was really intended to never return
> NUMA_NO_NODE (and only ever be used on valid memory ranges), in
> which case paging_init() would need to use something else (or
> make the call conditional): According to my audit all uses except
> the two in paging_init() are on (supposedly) valid addresses only
> (commonly when already holding in hands a valid struct page_info).
>
> Then again us having phys_to_nid() in the first place is somewhat
> bogus: No caller really starts from a physical address. It would
> be quite a bit more sensible to have page_to_nid() and mfn_to_nid(),
> the more that what phys_to_nid() does is pass the address to
> paddr_to_pdx() (undoing every caller's to-addr conversion). At which
> point the former could do strict checking (disallowing NUMA_NO_NODE
> output) while the latter could be more relaxed. I guess I'll make a
> patch along these lines.
>
Thanks, I will review it after you sent that patch.
Cheers,
Wei Chen
> Jan
On 13.12.2022 11:40, Wei Chen wrote: > Hi Jan, > >> -----Original Message----- >> From: Jan Beulich <jbeulich@suse.com> >> Sent: 2022年12月13日 17:47 >> To: Wei Chen <Wei.Chen@arm.com> >> Cc: nd <nd@arm.com>; Andrew Cooper <andrew.cooper3@citrix.com>; George >> Dunlap <george.dunlap@citrix.com>; Julien Grall <julien@xen.org>; Stefano >> Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Jiamei Xie >> <Jiamei.Xie@arm.com>; xen-devel@lists.xenproject.org; Roger Pau Monné >> <roger.pau@citrix.com> >> Subject: Re: [PATCH v9 3/6] xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON >> for phys_to_nid >> >> On 18.11.2022 11:45, Wei Chen wrote: >>> VIRTUAL_BUG_ON is an empty macro used in phys_to_nid. This >>> results in two lines of error-checking code in phys_to_nid >>> that is not actually working and causing two compilation >>> errors: >>> 1. error: "MAX_NUMNODES" undeclared (first use in this function). >>> This is because in the common header file, "MAX_NUMNODES" is >>> defined after the common header file includes the ARCH header >>> file, where phys_to_nid has attempted to use "MAX_NUMNODES". >>> This error was resolved after we moved the phys_to_nid from >>> x86 ARCH header file to common header file. >>> 2. error: wrong type argument to unary exclamation mark. >>> This is because, the error-checking code contains !node_data[nid]. >>> But node_data is a data structure variable, it's not a pointer. >>> >>> So, in this patch, we use ASSERT instead of VIRTUAL_BUG_ON to >>> enable the two lines of error-checking code. And fix the left >>> compilation errors by replacing !node_data[nid] to >>> !node_data[nid].node_spanned_pages. Although NUMA allows one node >>> can only have CPUs but without any memory. And node with 0 bytes >>> of memory might have an entry in memnodemap[] theoretically. But >>> that doesn't mean phys_to_nid can find any valid address from a >>> node with 0 bytes memory. >>> >>> Signed-off-by: Wei Chen <wei.chen@arm.com> >>> Tested-by: Jiamei Xie <jiamei.xie@arm.com> >>> Acked-by: Jan Beulich <jbeulich@suse.com> >> >> This patch is what is causing the regression found by osstest. The >> previously silent (dead) checks no trigger when paging_init() >> encounters a hole in SRAT-described space, as is the case e.g. on >> the himrods: >> >> (XEN) NUMA: Node 0 PXM 0 [0000000000000000, 000000007fffffff] >> (XEN) NUMA: Node 0 PXM 0 [0000000100000000, 000000087fffffff] >> (XEN) NUMA: Node 1 PXM 1 [0000000880000000, 000000107fffffff] >> (XEN) NUMA: Using 19 for the hash shift >> >> The node ID for 0x80000000 (slot 1 of memnodemap[]) is NUMA_NO_NODE, >> which of course fails the left side of ... >> > > I am sorry, I had not triggered this in my testing machine. I'm a bit > curious, on NUMA platforms will there be memory that doesn't belong to > any node? If not, why the space of 0x80000000 will be used in paging_init? The space isn't used (see the full memory map in the log collected by osstest), but the function is called early to calculate memflags (which then isn't used in that particular loop iteration). Jan
© 2016 - 2026 Red Hat, Inc.