From: "Nikola Z. Ivanov"
To: tglx@kernel.org, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, hpa@zytor.com, puwen@hygon.cn,
	peterz@infradead.org, mario.limonciello@amd.com,
	yazen.ghannam@amd.com, andrew.cooper3@citrix.com,
	kai.huang@intel.com, i@rong.moe,
	pawan.kumar.gupta@linux.intel.com, xin@zytor.com,
	darwi@linutronix.de, sohil.mehta@intel.com,
	suchitkarunakaran@gmail.com, sshegde@linux.ibm.com,
	kprateek.nayak@amd.com, yury.norov@gmail.com,
	ricardo.neri-calderon@linux.intel.com
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, "Nikola Z. Ivanov"
Subject: [RFC PATCH] x86/topo: Unify srat_detect_node among amd/intel/hygon
Date: Sun, 29 Mar 2026 15:08:41 +0300
Message-ID: <20260329120841.2118684-1-zlatistiv@gmail.com>

This change is provoked by a warning observed after commit
717b64d58cff ("x86/topo: Replace x86_has_numa_in_package")
when faking NUMA nodes on Intel. For example:

qemu-system-x86_64 \
	-kernel arch/x86/boot/bzImage \
	-append "console=ttyS0 root=/dev/sda debug numa=fake=2" \
	-hda $IMAGES/unstable.img \
	-cpu qemu64,vendor=GenuineIntel \
	-nographic \
	-m 2G \
	-smp 2

Will trigger:

[ 0.066755][ T0] ------------[ cut here ]------------
[ 0.066755][ T0] WARNING: arch/x86/kernel/smpboot.c:698 at set_cpu_sibling_map+0xe41/0x1f90, CPU#1: swapper/1/0
[ 0.066755][ T0] Call Trace:
[ 0.066755][ T0]
[ 0.066755][ T0] ap_starting+0x9e/0x140
[ 0.066755][ T0] ? __pfx_ap_starting+0x10/0x10
[ 0.066755][ T0] ? fpu__init_cpu_xstate+0x5c/0x320
[ 0.066755][ T0] start_secondary+0x66/0x110
[ 0.066755][ T0] common_startup_64+0x13e/0x147
[ 0.066755][ T0]

The check in smpboot.c suggests that the topology is invalid, as the
CPUs are in the same package but in different nodes.

Fix this by unifying the srat_detect_node() function among
amd/intel/hygon and taking the amd/hygon approach of falling back
to the LLC ID when SRAT does not provide a node. Place the function
in common.c and expose it in topology.h.

The hygon code is already nearly identical to the amd code, except
for the way it obtains the LLC ID. Use the hygon variant
(c->topo.llc_id), since struct cpuinfo_x86 is already passed to us.

Signed-off-by: Nikola Z. Ivanov
---
This is marked RFC as I lack the context for why the intel code
looks the way it does. I can see it went through a few changes
between 2008 and 2010, which makes me believe that the comment
regarding "not doing AMD heuristics for now" is long overdue.

Also, is a merge like this even desired in the first place?

Any feedback is appreciated!

 arch/x86/kernel/cpu/amd.c    | 74 ------------------------------------
 arch/x86/kernel/cpu/common.c | 74 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/hygon.c  | 73 -----------------------------------
 arch/x86/kernel/cpu/intel.c  | 17 ---------
 include/linux/topology.h     |  1 +
 5 files changed, 75 insertions(+), 164 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 09de584e4c8f..7a4c804e6836 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -276,80 +276,6 @@ static void init_amd_k7(struct cpuinfo_x86 *c)
 #endif
 }
 
-#ifdef CONFIG_NUMA
-/*
- * To workaround broken NUMA config. Read the comment in
- * srat_detect_node().
- */
-static int nearby_node(int apicid)
-{
-	int i, node;
-
-	for (i = apicid - 1; i >= 0; i--) {
-		node = __apicid_to_node[i];
-		if (node != NUMA_NO_NODE && node_online(node))
-			return node;
-	}
-	for (i = apicid + 1; i < MAX_LOCAL_APIC; i++) {
-		node = __apicid_to_node[i];
-		if (node != NUMA_NO_NODE && node_online(node))
-			return node;
-	}
-	return first_node(node_online_map); /* Shouldn't happen */
-}
-#endif
-
-static void srat_detect_node(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_NUMA
-	int cpu = smp_processor_id();
-	int node;
-	unsigned apicid = c->topo.apicid;
-
-	node = numa_cpu_node(cpu);
-	if (node == NUMA_NO_NODE)
-		node = per_cpu_llc_id(cpu);
-
-	/*
-	 * On multi-fabric platform (e.g. Numascale NumaChip) a
-	 * platform-specific handler needs to be called to fixup some
-	 * IDs of the CPU.
-	 */
-	if (x86_cpuinit.fixup_cpu_id)
-		x86_cpuinit.fixup_cpu_id(c, node);
-
-	if (!node_online(node)) {
-		/*
-		 * Two possibilities here:
-		 *
-		 * - The CPU is missing memory and no node was created. In
-		 *   that case try picking one from a nearby CPU.
-		 *
-		 * - The APIC IDs differ from the HyperTransport node IDs
-		 *   which the K8 northbridge parsing fills in. Assume
-		 *   they are all increased by a constant offset, but in
-		 *   the same order as the HT nodeids. If that doesn't
-		 *   result in a usable node fall back to the path for the
-		 *   previous case.
-		 *
-		 * This workaround operates directly on the mapping between
-		 * APIC ID and NUMA node, assuming certain relationship
-		 * between APIC ID, HT node ID and NUMA topology. As going
-		 * through CPU mapping may alter the outcome, directly
-		 * access __apicid_to_node[].
-		 */
-		int ht_nodeid = c->topo.initial_apicid;
-
-		if (__apicid_to_node[ht_nodeid] != NUMA_NO_NODE)
-			node = __apicid_to_node[ht_nodeid];
-		/* Pick a nearby node */
-		if (!node_online(node))
-			node = nearby_node(apicid);
-	}
-	numa_set_node(cpu, node);
-#endif
-}
-
 static void bsp_determine_snp(struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_ARCH_HAS_CC_PLATFORM
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a8ff4376c286..05fcfa7a5cb5 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2496,6 +2496,80 @@ void cpu_init(void)
 	load_fixmap_gdt(cpu);
 }
 
+#ifdef CONFIG_NUMA
+/*
+ * To workaround broken NUMA config. Read the comment in
+ * srat_detect_node().
+ */
+static int nearby_node(int apicid)
+{
+	int i, node;
+
+	for (i = apicid - 1; i >= 0; i--) {
+		node = __apicid_to_node[i];
+		if (node != NUMA_NO_NODE && node_online(node))
+			return node;
+	}
+	for (i = apicid + 1; i < MAX_LOCAL_APIC; i++) {
+		node = __apicid_to_node[i];
+		if (node != NUMA_NO_NODE && node_online(node))
+			return node;
+	}
+	return first_node(node_online_map); /* Shouldn't happen */
+}
+#endif
+
+void srat_detect_node(struct cpuinfo_x86 *c)
+{
+#ifdef CONFIG_NUMA
+	int cpu = smp_processor_id();
+	int node;
+	unsigned int apicid = c->topo.apicid;
+
+	node = numa_cpu_node(cpu);
+	if (node == NUMA_NO_NODE)
+		node = c->topo.llc_id;
+
+	/*
+	 * On multi-fabric platform (e.g. Numascale NumaChip) a
+	 * platform-specific handler needs to be called to fixup some
+	 * IDs of the CPU.
+	 */
+	if (x86_cpuinit.fixup_cpu_id)
+		x86_cpuinit.fixup_cpu_id(c, node);
+
+	if (!node_online(node)) {
+		/*
+		 * Two possibilities here:
+		 *
+		 * - The CPU is missing memory and no node was created. In
+		 *   that case try picking one from a nearby CPU.
+		 *
+		 * - The APIC IDs differ from the HyperTransport node IDs
+		 *   which the K8 northbridge parsing fills in. Assume
+		 *   they are all increased by a constant offset, but in
+		 *   the same order as the HT nodeids. If that doesn't
+		 *   result in a usable node fall back to the path for the
+		 *   previous case.
+		 *
+		 * This workaround operates directly on the mapping between
+		 * APIC ID and NUMA node, assuming certain relationship
+		 * between APIC ID, HT node ID and NUMA topology. As going
+		 * through CPU mapping may alter the outcome, directly
+		 * access __apicid_to_node[].
+		 */
+		int ht_nodeid = c->topo.initial_apicid;
+
+		if (__apicid_to_node[ht_nodeid] != NUMA_NO_NODE)
+			node = __apicid_to_node[ht_nodeid];
+		/* Pick a nearby node */
+		if (!node_online(node))
+			node = nearby_node(apicid);
+	}
+	numa_set_node(cpu, node);
+#endif
+}
+
 #ifdef CONFIG_MICROCODE_LATE_LOADING
 /**
  * store_cpu_caps() - Store a snapshot of CPU capabilities
diff --git a/arch/x86/kernel/cpu/hygon.c b/arch/x86/kernel/cpu/hygon.c
index 7f95a74e4c65..a33735094843 100644
--- a/arch/x86/kernel/cpu/hygon.c
+++ b/arch/x86/kernel/cpu/hygon.c
@@ -20,79 +20,6 @@
 
 #include "cpu.h"
 
-#ifdef CONFIG_NUMA
-/*
- * To workaround broken NUMA config. Read the comment in
- * srat_detect_node().
- */
-static int nearby_node(int apicid)
-{
-	int i, node;
-
-	for (i = apicid - 1; i >= 0; i--) {
-		node = __apicid_to_node[i];
-		if (node != NUMA_NO_NODE && node_online(node))
-			return node;
-	}
-	for (i = apicid + 1; i < MAX_LOCAL_APIC; i++) {
-		node = __apicid_to_node[i];
-		if (node != NUMA_NO_NODE && node_online(node))
-			return node;
-	}
-	return first_node(node_online_map); /* Shouldn't happen */
-}
-#endif
-
-static void srat_detect_node(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_NUMA
-	int cpu = smp_processor_id();
-	int node;
-	unsigned int apicid = c->topo.apicid;
-
-	node = numa_cpu_node(cpu);
-	if (node == NUMA_NO_NODE)
-		node = c->topo.llc_id;
-
-	/*
-	 * On multi-fabric platform (e.g. Numascale NumaChip) a
-	 * platform-specific handler needs to be called to fixup some
-	 * IDs of the CPU.
-	 */
-	if (x86_cpuinit.fixup_cpu_id)
-		x86_cpuinit.fixup_cpu_id(c, node);
-
-	if (!node_online(node)) {
-		/*
-		 * Two possibilities here:
-		 *
-		 * - The CPU is missing memory and no node was created. In
-		 *   that case try picking one from a nearby CPU.
-		 *
-		 * - The APIC IDs differ from the HyperTransport node IDs.
-		 *   Assume they are all increased by a constant offset, but
-		 *   in the same order as the HT nodeids. If that doesn't
-		 *   result in a usable node fall back to the path for the
-		 *   previous case.
-		 *
-		 * This workaround operates directly on the mapping between
-		 * APIC ID and NUMA node, assuming certain relationship
-		 * between APIC ID, HT node ID and NUMA topology. As going
-		 * through CPU mapping may alter the outcome, directly
-		 * access __apicid_to_node[].
-		 */
-		int ht_nodeid = c->topo.initial_apicid;
-
-		if (__apicid_to_node[ht_nodeid] != NUMA_NO_NODE)
-			node = __apicid_to_node[ht_nodeid];
-		/* Pick a nearby node */
-		if (!node_online(node))
-			node = nearby_node(apicid);
-	}
-	numa_set_node(cpu, node);
-#endif
-}
-
 static void bsp_init_hygon(struct cpuinfo_x86 *c)
 {
 	if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 646ff33c4651..12eeacb0de4b 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -467,23 +467,6 @@ static void intel_workarounds(struct cpuinfo_x86 *c)
 }
 #endif
 
-static void srat_detect_node(struct cpuinfo_x86 *c)
-{
-#ifdef CONFIG_NUMA
-	unsigned node;
-	int cpu = smp_processor_id();
-
-	/* Don't do the funky fallback heuristics the AMD version employs
-	   for now. */
-	node = numa_cpu_node(cpu);
-	if (node == NUMA_NO_NODE || !node_online(node)) {
-		/* reuse the value from init_cpu_to_node() */
-		node = cpu_to_node(cpu);
-	}
-	numa_set_node(cpu, node);
-#endif
-}
-
 static void init_cpuid_fault(struct cpuinfo_x86 *c)
 {
 	u64 msr;
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 6575af39fd10..9f71ad8a6983 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -41,6 +41,7 @@
 #endif
 
 int arch_update_cpu_topology(void);
+void srat_detect_node(struct cpuinfo_x86 *c);
 
 /* Conform to ACPI 2.0 SLIT distance definitions */
 #define LOCAL_DISTANCE	10
-- 
2.53.0
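
As a reading aid for the unified srat_detect_node(), the node selection
order used in the patch (SRAT node, then LLC ID, then the
__apicid_to_node[] lookup by initial APIC ID, then a nearby online node)
can be modeled in userspace roughly as below. This is only a sketch under
stated assumptions: struct toy_topo, pick_node(), online(), first_online(),
NO_NODE, MAX_APIC and MAX_NODES are hypothetical names that do not exist in
the kernel, the x86_cpuinit.fixup_cpu_id() hook is left out, and the bounds
check in online() is an addition of the model rather than something the
kernel code does.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE   (-1) /* hypothetical stand-in for NUMA_NO_NODE */
#define MAX_APIC  8    /* toy stand-in for MAX_LOCAL_APIC */
#define MAX_NODES 4    /* toy node count, not a kernel constant */

/* Hypothetical model of boot-time state; not a kernel structure. */
struct toy_topo {
	int  apicid_to_node[MAX_APIC]; /* models __apicid_to_node[] */
	bool node_online[MAX_NODES];   /* models node_online() */
};

/* Model of node_online(); the range check is an addition of this sketch. */
static bool online(const struct toy_topo *t, int node)
{
	return node >= 0 && node < MAX_NODES && t->node_online[node];
}

/* Model of first_node(node_online_map); returns NO_NODE if none is online. */
static int first_online(const struct toy_topo *t)
{
	for (int n = 0; n < MAX_NODES; n++)
		if (t->node_online[n])
			return n;
	return NO_NODE;
}

/* Same scan order as nearby_node() in the patch: lower APIC IDs first. */
static int nearby_node(const struct toy_topo *t, int apicid)
{
	for (int i = apicid - 1; i >= 0; i--)
		if (online(t, t->apicid_to_node[i]))
			return t->apicid_to_node[i];
	for (int i = apicid + 1; i < MAX_APIC; i++)
		if (online(t, t->apicid_to_node[i]))
			return t->apicid_to_node[i];
	return first_online(t);
}

/*
 * srat_node models the result of numa_cpu_node(); llc_id models
 * c->topo.llc_id. The fixup_cpu_id() hook is intentionally omitted.
 */
static int pick_node(const struct toy_topo *t, int srat_node, int llc_id,
		     int apicid, int initial_apicid)
{
	int node = srat_node;

	assert(apicid >= 0 && apicid < MAX_APIC);
	assert(initial_apicid >= 0 && initial_apicid < MAX_APIC);

	if (node == NO_NODE)
		node = llc_id; /* fall back to the LLC ID */

	if (!online(t, node)) {
		if (t->apicid_to_node[initial_apicid] != NO_NODE)
			node = t->apicid_to_node[initial_apicid];
		if (!online(t, node))
			node = nearby_node(t, apicid); /* last resort */
	}
	return node;
}
```

In this model, two CPUs that have no SRAT node and report the same LLC ID
end up with the same node as long as that node is online, which appears to
be the property the smpboot.c check expects of CPUs sharing a package.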