From nobody Tue Apr 7 20:08:12 2026
From: K Prateek Nayak
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Valentin Schneider
CC: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Chen Yu, Shrikanth Hegde, Li Chen, "Gautham R. Shenoy", K Prateek Nayak
Subject: [PATCH v4 2/9] sched/topology: Extract "imb_numa_nr" calculation into a separate helper
Date: Thu, 12 Mar 2026 04:44:27 +0000
Message-ID: <20260312044434.1974-3-kprateek.nayak@amd.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260312044434.1974-1-kprateek.nayak@amd.com>
References: <20260312044434.1974-1-kprateek.nayak@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Subsequent changes to assign "sd->shared" from "s_data" would
necessitate finding the topmost SD_SHARE_LLC to assign shared object to.
This is very similar to the "imb_numa_nr" computation loop except that
"imb_numa_nr" cares about the first domain without the SD_SHARE_LLC flag
(immediate parent of sd_llc) whereas the "sd->shared" assignment would
require sd_llc itself.

Extract the "imb_numa_nr" calculation into a helper
adjust_numa_imbalance() and use the current loop in the
build_sched_domains() to find the sd_llc.

While at it, guard the call behind CONFIG_NUMA's status since
"imb_numa_nr" only makes sense on NUMA enabled configs with SD_NUMA
domains.

No functional changes intended.

Suggested-by: Valentin Schneider
Signed-off-by: K Prateek Nayak
---
Changelog v3..v4:

o New patch based on the suggestion from Valentin and Chenyu in
  https://lore.kernel.org/lkml/xhsmh343e43fd.mognet@vschneid-thinkpadt14sgen2i.remote.csb/

  Notable deviation is moving the entire "imb_numa_nr" loop into the
  adjust_numa_imbalance() helper to keep all the bits in one place
  instead of passing "imb" and "imb_span" as references to the helper.

o Guarded the call behind CONFIG_NUMA's status to save overhead when
  NUMA domains don't exist.
---
 kernel/sched/topology.c | 133 ++++++++++++++++++++++++----------------
 1 file changed, 80 insertions(+), 53 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 34b20b0e1867..7f25c784c038 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2551,6 +2551,74 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
 	return true;
 }
 
+/*
+ * Calculate an allowed NUMA imbalance such that LLCs do not get
+ * imbalanced.
+ */
+static void adjust_numa_imbalance(struct sched_domain *sd_llc)
+{
+	struct sched_domain *parent;
+	unsigned int imb_span = 1;
+	unsigned int imb = 0;
+	unsigned int nr_llcs;
+
+	WARN_ON(!(sd_llc->flags & SD_SHARE_LLC));
+	WARN_ON(!sd_llc->parent);
+
+	/*
+	 * For a single LLC per node, allow an
+	 * imbalance up to 12.5% of the node. This is
+	 * arbitrary cutoff based two factors -- SMT and
+	 * memory channels. For SMT-2, the intent is to
+	 * avoid premature sharing of HT resources but
+	 * SMT-4 or SMT-8 *may* benefit from a different
+	 * cutoff. For memory channels, this is a very
+	 * rough estimate of how many channels may be
+	 * active and is based on recent CPUs with
+	 * many cores.
+	 *
+	 * For multiple LLCs, allow an imbalance
+	 * until multiple tasks would share an LLC
+	 * on one node while LLCs on another node
+	 * remain idle. This assumes that there are
+	 * enough logical CPUs per LLC to avoid SMT
+	 * factors and that there is a correlation
+	 * between LLCs and memory channels.
+	 */
+	nr_llcs = sd_llc->parent->span_weight / sd_llc->span_weight;
+	if (nr_llcs == 1)
+		imb = sd_llc->parent->span_weight >> 3;
+	else
+		imb = nr_llcs;
+
+	imb = max(1U, imb);
+	sd_llc->parent->imb_numa_nr = imb;
+
+	/*
+	 * Set span based on the first NUMA domain.
+	 *
+	 * NUMA systems always add a NODE domain before
+	 * iterating the NUMA domains. Since this is before
+	 * degeneration, start from sd_llc's parent's
+	 * parent which is the lowest an SD_NUMA domain can
+	 * be relative to sd_llc.
+	 */
+	parent = sd_llc->parent->parent;
+	while (parent && !(parent->flags & SD_NUMA))
+		parent = parent->parent;
+
+	imb_span = parent ? parent->span_weight : sd_llc->parent->span_weight;
+
+	/* Update the upper remainder of the topology */
+	parent = sd_llc->parent;
+	while (parent) {
+		int factor = max(1U, (parent->span_weight / imb_span));
+
+		parent->imb_numa_nr = imb * factor;
+		parent = parent->parent;
+	}
+}
+
 /*
  * Build sched domains for a given set of CPUs and attach the sched domains
  * to the individual CPUs
@@ -2608,62 +2676,21 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 		}
 	}
 
-	/*
-	 * Calculate an allowed NUMA imbalance such that LLCs do not get
-	 * imbalanced.
-	 */
 	for_each_cpu(i, cpu_map) {
-		unsigned int imb = 0;
-		unsigned int imb_span = 1;
+		sd = *per_cpu_ptr(d.sd, i);
+		if (!sd)
+			continue;
 
-		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
-			struct sched_domain *child = sd->child;
-
-			if (!(sd->flags & SD_SHARE_LLC) && child &&
-			    (child->flags & SD_SHARE_LLC)) {
-				struct sched_domain __rcu *top_p;
-				unsigned int nr_llcs;
-
-				/*
-				 * For a single LLC per node, allow an
-				 * imbalance up to 12.5% of the node. This is
-				 * arbitrary cutoff based two factors -- SMT and
-				 * memory channels. For SMT-2, the intent is to
-				 * avoid premature sharing of HT resources but
-				 * SMT-4 or SMT-8 *may* benefit from a different
-				 * cutoff. For memory channels, this is a very
-				 * rough estimate of how many channels may be
-				 * active and is based on recent CPUs with
-				 * many cores.
-				 *
-				 * For multiple LLCs, allow an imbalance
-				 * until multiple tasks would share an LLC
-				 * on one node while LLCs on another node
-				 * remain idle. This assumes that there are
-				 * enough logical CPUs per LLC to avoid SMT
-				 * factors and that there is a correlation
-				 * between LLCs and memory channels.
-				 */
-				nr_llcs = sd->span_weight / child->span_weight;
-				if (nr_llcs == 1)
-					imb = sd->span_weight >> 3;
-				else
-					imb = nr_llcs;
-				imb = max(1U, imb);
-				sd->imb_numa_nr = imb;
-
-				/* Set span based on the first NUMA domain. */
-				top_p = sd->parent;
-				while (top_p && !(top_p->flags & SD_NUMA)) {
-					top_p = top_p->parent;
-				}
-				imb_span = top_p ? top_p->span_weight : sd->span_weight;
-			} else {
-				int factor = max(1U, (sd->span_weight / imb_span));
+		/* First, find the topmost SD_SHARE_LLC domain */
+		while (sd->parent && (sd->parent->flags & SD_SHARE_LLC))
+			sd = sd->parent;
 
-				sd->imb_numa_nr = imb * factor;
-			}
-		}
+		/*
+		 * In presence of higher domains, adjust the
+		 * NUMA imbalance stats for the hierarchy.
+		 */
+		if (IS_ENABLED(CONFIG_NUMA) && (sd->flags & SD_SHARE_LLC) && sd->parent)
+			adjust_numa_imbalance(sd);
 	}
 
 	/* Calculate CPU capacity for physical packages and nodes */
-- 
2.34.1
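
For readers who want to check the arithmetic outside the kernel, the following is a rough userspace sketch of the calculation the helper performs. It is not kernel code. "struct model_domain", the MODEL_* flag values, max_u() and model_adjust_numa_imbalance() are hypothetical stand-ins invented for this illustration; only the arithmetic is taken from the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for SD_SHARE_LLC / SD_NUMA; the values are arbitrary. */
#define MODEL_SHARE_LLC 0x1u
#define MODEL_NUMA      0x2u

/* Hypothetical, heavily reduced model of a sched domain. */
struct model_domain {
	struct model_domain *parent;
	unsigned int span_weight;	/* number of CPUs spanned */
	unsigned int flags;
	unsigned int imb_numa_nr;
};

static unsigned int max_u(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Mirrors the arithmetic of the helper in the patch on the model struct. */
static void model_adjust_numa_imbalance(struct model_domain *sd_llc)
{
	struct model_domain *parent;
	unsigned int imb_span, imb, nr_llcs;

	/* The patch only warns on these conditions; this model aborts instead. */
	assert(sd_llc->flags & MODEL_SHARE_LLC);
	assert(sd_llc->parent);

	/* Single LLC per node: 12.5% of the node. Otherwise: one task per LLC. */
	nr_llcs = sd_llc->parent->span_weight / sd_llc->span_weight;
	if (nr_llcs == 1)
		imb = sd_llc->parent->span_weight >> 3;
	else
		imb = nr_llcs;
	imb = max_u(1U, imb);

	/* The span of the first NUMA domain at or above the grandparent. */
	parent = sd_llc->parent->parent;
	while (parent && !(parent->flags & MODEL_NUMA))
		parent = parent->parent;
	imb_span = parent ? parent->span_weight : sd_llc->parent->span_weight;

	/* Scale the allowed imbalance for every domain above the LLC. */
	for (parent = sd_llc->parent; parent; parent = parent->parent) {
		unsigned int factor = max_u(1U, parent->span_weight / imb_span);

		parent->imb_numa_nr = imb * factor;
	}
}
```

Taking a made-up hierarchy of a 16-CPU LLC domain under a 128-CPU NODE domain, with NUMA domains of 256 and 512 CPUs above it: nr_llcs is 8, the first NUMA span is 256, and the resulting imb_numa_nr values are 8 for NODE, 8 for the first NUMA level and 16 for the second.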