When nr_cpus is set to a smaller number than actually present, the
cpu_to_node() mapping information for unused CPUs is not available to
build_socket_tables(). This results in an incomplete table and will
later cause use of a -1 value for some array indexing, and eventual
kernel page faults.
Switch to using the __apicid_to_node array, which still contains all
the information mapping apicids to nodes, even for CPUs disabled with
a reduced nr_cpus setting.
Fixes: 8a50c5851927 ("x86/platform/uv: UV support for sub-NUMA clustering")
Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
---
arch/x86/kernel/apic/x2apic_uv_x.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
This is essentially version 2 of "[PATCH] x86/platform/uv: Abort UV
initialization when reduced nr_cpus requires it". However, that title
is no longer accurate: aborting UV initialization is no longer needed.
Previous discussion can be found here:
https://lore.kernel.org/lkml/20230711202618.85562-1-steve.wahl@hpe.com/
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index d9384d5b4b8e..35acc95c6dd5 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -1571,7 +1571,7 @@ static void __init build_socket_tables(void)
 {
 	struct uv_gam_range_entry *gre = uv_gre_table;
 	int nums, numn, nump;
-	int cpu, i, lnid;
+	int i, lnid, apicid;
 	int minsock = _min_socket;
 	int maxsock = _max_socket;
 	int minpnode = _min_pnode;
@@ -1622,15 +1622,14 @@ static void __init build_socket_tables(void)
 
 	/* Set socket -> node values: */
 	lnid = NUMA_NO_NODE;
-	for_each_possible_cpu(cpu) {
-		int nid = cpu_to_node(cpu);
-		int apicid, sockid;
+	for (apicid = 0; apicid < ARRAY_SIZE(__apicid_to_node); apicid++) {
+		int nid = __apicid_to_node[apicid];
+		int sockid;
 
-		if (lnid == nid)
+		if ((nid == NUMA_NO_NODE) || (lnid == nid))
 			continue;
 		lnid = nid;
 
-		apicid = per_cpu(x86_cpu_to_apicid, cpu);
 		sockid = apicid >> uv_cpuid.socketid_shift;
 
 		if (_socket_to_node[sockid - minsock] == SOCK_EMPTY)
--
2.26.2
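
For context on why this fix works: __apicid_to_node is filled in from the
firmware's SRAT tables during early boot, before the nr_cpus limit trims
the set of possible CPUs, so it retains apicid-to-node mappings even for
CPUs that end up disabled. A minimal sketch of how the array is maintained
(simplified and paraphrased from arch/x86/mm/numa.c, not the verbatim
kernel source):

    /* Every entry starts out as NUMA_NO_NODE; SRAT parsing records an
     * apicid -> node mapping for each CPU the firmware reports,
     * regardless of any later nr_cpus restriction. */
    s16 __apicid_to_node[MAX_LOCAL_APIC] = {
            [0 ... MAX_LOCAL_APIC - 1] = NUMA_NO_NODE
    };

    void __init set_apicid_to_node(int apicid, s16 node)
    {
            if (apicid >= MAX_LOCAL_APIC)
                    return;
            __apicid_to_node[apicid] = node;
    }

Because this array is indexed by apicid rather than by CPU number, the new
loop above can recover the node (and from the apicid, the socket) for every
CPU the firmware reported, not just the ones left in the possible map.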
On 8/7/23 07:17, Steve Wahl wrote:
> When nr_cpus is set to a smaller number than actually present, the
> cpu_to_node() mapping information for unused CPUs is not available to
> build_socket_tables(). This results in an incomplete table and will
> later cause use of a -1 value for some array indexing, and eventual
> kernel page faults.
>
> Switch to using the __apicid_to_node array, which still contains all
> the information mapping apicids to nodes, even for CPUs disabled with
> a reduced nr_cpus setting.

Before, the lookup went:

	CPU => APICID => SOCKET

But when the CPU wasn't present, there wasn't a way to start this lookup.

So, instead of looping over all CPUs, looking up their APICIDs and
mapping those to sockets, just take CPUs out of the equation entirely.

Loop over all APICIDs which are mapped to a valid NUMA node. Then just
extract the socket-id from the APICID.

Right?

That seems sane enough. It's also way less code than the previous approach.
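
To make the APICID => socket step Dave describes concrete, here is a toy
user-space example (the shift value is made up for illustration; on real
UV hardware it comes from CPUID and is stored in uv_cpuid.socketid_shift):

    #include <stdio.h>

    int main(void)
    {
            /* Hypothetical shift value, not read from real hardware. */
            unsigned int socketid_shift = 6;
            unsigned int apicids[] = { 0x00, 0x3f, 0x40, 0x81 };

            /* The socket id is simply the high bits of the apicid. */
            for (unsigned int i = 0; i < sizeof(apicids) / sizeof(apicids[0]); i++)
                    printf("apicid 0x%02x -> socket %u\n",
                           apicids[i], apicids[i] >> socketid_shift);
            return 0;
    }

With a shift of 6, apicids 0x00 and 0x3f land in socket 0, 0x40 in
socket 1, and 0x81 in socket 2.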
On Fri, Sep 01, 2023 at 10:18:20AM -0700, Dave Hansen wrote:
> On 8/7/23 07:17, Steve Wahl wrote:
> > When nr_cpus is set to a smaller number than actually present, the
> > cpu_to_node() mapping information for unused CPUs is not available to
> > build_socket_tables(). [...]
>
> Before, the lookup went:
>
> 	CPU => APICID => SOCKET
>
> But when the CPU wasn't present, there wasn't a way to start this lookup.
>
> So, instead of looping over all CPUs, looking up their APICIDs and
> mapping those to sockets, just take CPUs out of the equation entirely.
>
> Loop over all APICIDs which are mapped to a valid NUMA node. Then just
> extract the socket-id from the APICID.
>
> Right?
>
> That seems sane enough. It's also way less code than the previous approach.

Yes. That's it precisely. And, yes, way less code.

Thanks.

--> Steve Wahl

--
Steve Wahl, Hewlett Packard Enterprise
On Fri, Sep 01, 2023 at 02:01:52PM -0500, Steve Wahl wrote:
> On Fri, Sep 01, 2023 at 10:18:20AM -0700, Dave Hansen wrote:
> > [...]
> > That seems sane enough. It's also way less code than the previous approach.
>
> Yes. That's it precisely. And, yes, way less code.

Are you willing to give a "Reviewed-by:"?

--> Steve

--
Steve Wahl, Hewlett Packard Enterprise
On 9/5/23 08:01, Steve Wahl wrote:
> On Fri, Sep 01, 2023 at 02:01:52PM -0500, Steve Wahl wrote:
>> On Fri, Sep 01, 2023 at 10:18:20AM -0700, Dave Hansen wrote:
>>> [...]
>>> That seems sane enough. It's also way less code than the previous approach.
>> Yes. That's it precisely. And, yes, way less code.
> Are you willing to give a "Reviewed-by:"?

Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>

Does this need a stable@ tag when it gets applied?
On Thu, Sep 07, 2023 at 01:57:16PM -0700, Dave Hansen wrote:
> On 9/5/23 08:01, Steve Wahl wrote:
> > [...]
> > Are you willing to give a "Reviewed-by:"?
>
> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
>
> Does this need a stable@ tag when it gets applied?

I hadn't thought about that. I think it meets the requirements in
stable-kernel-rules.rst. And it looks like it should apply without
conflicts or prerequisites. So it probably should. Is there a way to
add a cc:stable tag at this point?

Thank you,

--> Steve Wahl

--
Steve Wahl, Hewlett Packard Enterprise
On 9/7/23 14:25, Steve Wahl wrote:
>> Does this need a stable@ tag when it gets applied?
> I hadn't thought about that. I think it meets the requirements in
> stable-kernel-rules.rst. And it looks like it should apply without
> conflicts or prerequisites. So it probably should. Is there a way to
> add a cc:stable tag at this point?

I can do it when I apply it. I just wanted to make sure there wasn't a
good reason you left off the cc:stable.
On Thu, Sep 07, 2023 at 02:32:11PM -0700, Dave Hansen wrote:
> On 9/7/23 14:25, Steve Wahl wrote:
> > [...] Is there a way to
> > add a cc:stable tag at this point?
>
> I can do it when I apply it. I just wanted to make sure there wasn't a
> good reason you left off the cc:stable.

No reason, it was an oversight.

Thanks, Dave.

--> Steve Wahl

--
Steve Wahl, Hewlett Packard Enterprise
On Mon, Aug 07, 2023 at 09:17:30AM -0500, Steve Wahl wrote:
> When nr_cpus is set to a smaller number than actually present, the
> cpu_to_node() mapping information for unused CPUs is not available to
> build_socket_tables(). This results in an incomplete table and will
> later cause use of a -1 value for some array indexing, and eventual
> kernel page faults.
>
> Switch to using the __apicid_to_node array, which still contains all
> the information mapping apicids to nodes, even for CPUs disabled with
> a reduced nr_cpus setting.
>
> [remainder of the patch, quoted in full above, snipped]

Gentle ping on this patch?

Thanks.

--> Steve Wahl

--
Steve Wahl, Hewlett Packard Enterprise