Date: Wed, 4 Feb 2026 18:24:26 -0600
From: Kyle Meyer
To: tim.c.chen@linux.intel.com, bp@alien8.de, dave.hansen@linux.intel.com,
 mingo@redhat.com, peterz@infradead.org, tglx@kernel.org,
 vinicius.gomes@intel.com
Cc: brgerst@gmail.com, hpa@zytor.com, kprateek.nayak@amd.com,
 linux-kernel@vger.kernel.org, patryk.wlazlyn@linux.intel.com,
 rafael.j.wysocki@intel.com, russ.anderson@hpe.com, x86@kernel.org,
 yu.c.chen@intel.com, zhao1.liu@intel.com, kyle.meyer@hpe.com
Subject: [PATCH v2] sched/topology: Check average distances to remote packages

Granite Rapids (GNR) and Clearwater Forest (CWF) average distances to
remote packages to fix scheduler domains; see [1] for more information.

A warning and backtrace are printed when sub-NUMA clustering (SNC) is
enabled and there are more than 2 packages because the average distances
to remote packages could be different, skewing the single average remote
distance.

This is unnecessary when the average distances to remote packages are
the same.

Support single average remote distance on systems with more than 2
packages, preventing unnecessary warnings and backtraces, by checking if
average distances to remote packages are the same.

[1] commit 4d6dd05d07d0 ("sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode").

Reviewed-by: Tim Chen
Signed-off-by: Kyle Meyer
---
The warning and backtrace were noticed on a 16 socket GNR system with SNC-2 enabled.

v1:
* https://lore.kernel.org/all/aXjvLjTCRe8d3UFD@hpe.com/

v1 -> v2:
* Initialize pkg_total_distance and pkg_nr_remote to NULL, as suggested by Tim.
---
 arch/x86/kernel/smpboot.c | 69 ++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 5cd6950ab672..dc8f15bd2e19 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -518,27 +518,69 @@ static int avg_remote_numa_distance(void)
 {
 	int i, j;
 	int distance, nr_remote, total_distance;
+	int max_pkgs = topology_max_packages();
+	int cpu, pkg, pkg_avg_distance;
+	int *pkg_total_distance = NULL, *pkg_nr_remote = NULL;
 
 	if (sched_avg_remote_distance > 0)
 		return sched_avg_remote_distance;
 
+	sched_avg_remote_distance = REMOTE_DISTANCE;
+
 	nr_remote = 0;
 	total_distance = 0;
+
+	pkg_total_distance = kcalloc(max_pkgs, sizeof(int), GFP_KERNEL);
+	if (!pkg_total_distance)
+		goto cleanup;
+
+	pkg_nr_remote = kcalloc(max_pkgs, sizeof(int), GFP_KERNEL);
+	if (!pkg_nr_remote)
+		goto cleanup;
+
 	for_each_node_state(i, N_CPU) {
 		for_each_node_state(j, N_CPU) {
 			distance = node_distance(i, j);
 
-			if (distance >= REMOTE_DISTANCE) {
-				nr_remote++;
-				total_distance += distance;
-			}
+			if (distance < REMOTE_DISTANCE)
+				continue;
+
+			nr_remote++;
+			total_distance += distance;
+
+			cpu = cpumask_first(cpumask_of_node(j));
+			if (cpu >= nr_cpu_ids)
+				continue;
+
+			pkg = topology_physical_package_id(cpu);
+			pkg_total_distance[pkg] += distance;
+			pkg_nr_remote[pkg]++;
 		}
 	}
-	if (nr_remote)
-		sched_avg_remote_distance = total_distance / nr_remote;
-	else
-		sched_avg_remote_distance = REMOTE_DISTANCE;
 
+	if (!nr_remote)
+		goto cleanup;
+
+	sched_avg_remote_distance = total_distance / nr_remote;
+
+	/*
+	 * Single average remote distance won't be appropriate if different
+	 * packages have different distances to remote packages.
+	 */
+	for (i = 0; i < max_pkgs; i++) {
+		if (!pkg_nr_remote[i])
+			continue;
+
+		pkg_avg_distance = pkg_total_distance[i] / pkg_nr_remote[i];
+
+		pr_debug("sched: Avg. distance to remote package %d: %d\n", i, pkg_avg_distance);
+
+		if (pkg_avg_distance != sched_avg_remote_distance)
+			WARN_ONCE(1, "sched: Avg. distances to remote packages are different\n");
+	}
+cleanup:
+	kfree(pkg_nr_remote);
+	kfree(pkg_total_distance);
 	return sched_avg_remote_distance;
 }
 
@@ -564,18 +606,7 @@ int arch_sched_node_distance(int from, int to)
 		 * in the remote package in the same sched group.
 		 * Simplify NUMA domains and avoid extra NUMA levels including
 		 * different remote NUMA nodes and local nodes.
-		 *
-		 * GNR and CWF don't expect systems with more than 2 packages
-		 * and more than 2 hops between packages. Single average remote
-		 * distance won't be appropriate if there are more than 2
-		 * packages as average distance to different remote packages
-		 * could be different.
 		 */
-		WARN_ONCE(topology_max_packages() > 2,
-			  "sched: Expect only up to 2 packages for GNR or CWF, "
-			  "but saw %d packages when building sched domains.",
-			  topology_max_packages());
-
 		d = avg_remote_numa_distance();
 	}
 	return d;
-- 
2.52.0
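
For readers who want to reason about the check outside the kernel, here is
a minimal userspace sketch of the same idea. It is an illustration only,
not the kernel code: MAX_PKGS, the flat row-major distance matrices, the
node-to-package arrays, and both function names are hypothetical and exist
only in this sketch. REMOTE_DISTANCE is 20, as in the kernel. Like the
patch, it compares truncated integer averages.

```c
#include <assert.h>
#include <stdbool.h>

#define REMOTE_DISTANCE	20	/* same value as the kernel's REMOTE_DISTANCE */
#define MAX_PKGS	16	/* hypothetical bound, used only by this sketch */

/* Made-up example: 2 packages, 2 SNC nodes each, symmetric distances. */
static const int snc2_dist[16] = {
	10, 12, 21, 26,
	12, 10, 26, 21,
	21, 26, 10, 12,
	26, 21, 12, 10,
};
static const int snc2_pkg[4] = { 0, 0, 1, 1 };

/* Made-up example: 3 packages, 1 node each, one link is much longer. */
static const int skew_dist[9] = {
	10, 20, 40,
	20, 10, 20,
	40, 20, 10,
};
static const int skew_pkg[3] = { 0, 1, 2 };

/* Average of all matrix entries that are at least REMOTE_DISTANCE. */
static int avg_remote_distance(const int *dist, int nr_nodes)
{
	int i, total = 0, nr_remote = 0;

	for (i = 0; i < nr_nodes * nr_nodes; i++) {
		if (dist[i] < REMOTE_DISTANCE)
			continue;
		nr_remote++;
		total += dist[i];
	}
	return nr_remote ? total / nr_remote : REMOTE_DISTANCE;
}

/*
 * True when the average remote distance to every package equals the
 * overall average, i.e. a single average describes all packages.
 */
static bool pkg_avgs_match(const int *dist, const int *node_pkg, int nr_nodes)
{
	int pkg_total[MAX_PKGS] = { 0 }, pkg_nr[MAX_PKGS] = { 0 };
	int avg = avg_remote_distance(dist, nr_nodes);
	int i, j, d;

	for (i = 0; i < nr_nodes; i++) {
		for (j = 0; j < nr_nodes; j++) {
			d = dist[i * nr_nodes + j];
			if (d < REMOTE_DISTANCE)
				continue;

			/* Account the distance to the package owning node j. */
			assert(node_pkg[j] >= 0 && node_pkg[j] < MAX_PKGS);
			pkg_total[node_pkg[j]] += d;
			pkg_nr[node_pkg[j]]++;
		}
	}

	for (i = 0; i < MAX_PKGS; i++) {
		if (pkg_nr[i] && pkg_total[i] / pkg_nr[i] != avg)
			return false;
	}
	return true;
}
```

With the symmetric snc2_dist matrix every package averages to the overall
value, so no warning would be needed. With skew_dist the packages disagree,
which is the case the patch still warns about.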