Date: Wed, 18 Mar 2026 08:08:43 -0000
From: "tip-bot2 for K Prateek Nayak"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] sched/topology: Extract "imb_numa_nr" calculation into a separate helper
Cc: Valentin Schneider, K Prateek Nayak, "Peter Zijlstra (Intel)", Dietmar Eggemann, x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260312044434.1974-3-kprateek.nayak@amd.com>
References: <20260312044434.1974-3-kprateek.nayak@amd.com>
Message-ID: <177382132322.1647592.2119369279257157888.tip-bot2@tip-bot2>
The following commit has been merged into the sched/core branch of tip:

Commit-ID:     5a7b576b3ec1acc2694c5b58f80cd1d44a11b2c1
Gitweb:        https://git.kernel.org/tip/5a7b576b3ec1acc2694c5b58f80cd1d44a11b2c1
Author:        K Prateek Nayak
AuthorDate:    Thu, 12 Mar 2026 04:44:27
Committer:     Peter Zijlstra
CommitterDate: Wed, 18 Mar 2026 09:06:48 +01:00

sched/topology: Extract "imb_numa_nr" calculation into a separate helper

Subsequent changes to assign "sd->shared" from "s_data" would necessitate
finding the topmost SD_SHARE_LLC domain to assign the shared object to.
This is very similar to the "imb_numa_nr" computation loop, except that
"imb_numa_nr" cares about the first domain without the SD_SHARE_LLC flag
(the immediate parent of sd_llc) whereas the "sd->shared" assignment
requires sd_llc itself.

Extract the "imb_numa_nr" calculation into a helper,
adjust_numa_imbalance(), and use the current loop in build_sched_domains()
to find sd_llc. While at it, guard the call behind CONFIG_NUMA, since
"imb_numa_nr" only makes sense on NUMA-enabled configs with SD_NUMA
domains.

No functional changes intended.

Suggested-by: Valentin Schneider
Signed-off-by: K Prateek Nayak
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Dietmar Eggemann
Tested-by: Dietmar Eggemann
Link: https://patch.msgid.link/20260312044434.1974-3-kprateek.nayak@amd.com
---
 kernel/sched/topology.c | 133 +++++++++++++++++++++++----------------
 1 file changed, 80 insertions(+), 53 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 79bab80..6303790 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2550,6 +2550,74 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
 }
 
 /*
+ * Calculate an allowed NUMA imbalance such that LLCs do not get
+ * imbalanced.
+ */
+static void adjust_numa_imbalance(struct sched_domain *sd_llc)
+{
+	struct sched_domain *parent;
+	unsigned int imb_span = 1;
+	unsigned int imb = 0;
+	unsigned int nr_llcs;
+
+	WARN_ON(!(sd_llc->flags & SD_SHARE_LLC));
+	WARN_ON(!sd_llc->parent);
+
+	/*
+	 * For a single LLC per node, allow an
+	 * imbalance up to 12.5% of the node. This is
+	 * arbitrary cutoff based two factors -- SMT and
+	 * memory channels. For SMT-2, the intent is to
+	 * avoid premature sharing of HT resources but
+	 * SMT-4 or SMT-8 *may* benefit from a different
+	 * cutoff. For memory channels, this is a very
+	 * rough estimate of how many channels may be
+	 * active and is based on recent CPUs with
+	 * many cores.
+	 *
+	 * For multiple LLCs, allow an imbalance
+	 * until multiple tasks would share an LLC
+	 * on one node while LLCs on another node
+	 * remain idle. This assumes that there are
+	 * enough logical CPUs per LLC to avoid SMT
+	 * factors and that there is a correlation
+	 * between LLCs and memory channels.
+	 */
+	nr_llcs = sd_llc->parent->span_weight / sd_llc->span_weight;
+	if (nr_llcs == 1)
+		imb = sd_llc->parent->span_weight >> 3;
+	else
+		imb = nr_llcs;
+
+	imb = max(1U, imb);
+	sd_llc->parent->imb_numa_nr = imb;
+
+	/*
+	 * Set span based on the first NUMA domain.
+	 *
+	 * NUMA systems always add a NODE domain before
+	 * iterating the NUMA domains. Since this is before
+	 * degeneration, start from sd_llc's parent's
+	 * parent which is the lowest an SD_NUMA domain can
+	 * be relative to sd_llc.
+	 */
+	parent = sd_llc->parent->parent;
+	while (parent && !(parent->flags & SD_NUMA))
+		parent = parent->parent;
+
+	imb_span = parent ? parent->span_weight : sd_llc->parent->span_weight;
+
+	/* Update the upper remainder of the topology */
+	parent = sd_llc->parent;
+	while (parent) {
+		int factor = max(1U, (parent->span_weight / imb_span));
+
+		parent->imb_numa_nr = imb * factor;
+		parent = parent->parent;
+	}
+}
+
+/*
  * Build sched domains for a given set of CPUs and attach the sched domains
  * to the individual CPUs
  */
@@ -2606,62 +2674,21 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 		}
 	}
 
-	/*
-	 * Calculate an allowed NUMA imbalance such that LLCs do not get
-	 * imbalanced.
-	 */
 	for_each_cpu(i, cpu_map) {
-		unsigned int imb = 0;
-		unsigned int imb_span = 1;
+		sd = *per_cpu_ptr(d.sd, i);
+		if (!sd)
+			continue;
 
-		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
-			struct sched_domain *child = sd->child;
-
-			if (!(sd->flags & SD_SHARE_LLC) && child &&
-			    (child->flags & SD_SHARE_LLC)) {
-				struct sched_domain __rcu *top_p;
-				unsigned int nr_llcs;
-
-				/*
-				 * For a single LLC per node, allow an
-				 * imbalance up to 12.5% of the node. This is
-				 * arbitrary cutoff based two factors -- SMT and
-				 * memory channels. For SMT-2, the intent is to
-				 * avoid premature sharing of HT resources but
-				 * SMT-4 or SMT-8 *may* benefit from a different
-				 * cutoff. For memory channels, this is a very
-				 * rough estimate of how many channels may be
-				 * active and is based on recent CPUs with
-				 * many cores.
-				 *
-				 * For multiple LLCs, allow an imbalance
-				 * until multiple tasks would share an LLC
-				 * on one node while LLCs on another node
-				 * remain idle. This assumes that there are
-				 * enough logical CPUs per LLC to avoid SMT
-				 * factors and that there is a correlation
-				 * between LLCs and memory channels.
-				 */
-				nr_llcs = sd->span_weight / child->span_weight;
-				if (nr_llcs == 1)
-					imb = sd->span_weight >> 3;
-				else
-					imb = nr_llcs;
-				imb = max(1U, imb);
-				sd->imb_numa_nr = imb;
-
-				/* Set span based on the first NUMA domain. */
-				top_p = sd->parent;
-				while (top_p && !(top_p->flags & SD_NUMA)) {
-					top_p = top_p->parent;
-				}
-				imb_span = top_p ? top_p->span_weight : sd->span_weight;
-			} else {
-				int factor = max(1U, (sd->span_weight / imb_span));
+		/* First, find the topmost SD_SHARE_LLC domain */
+		while (sd->parent && (sd->parent->flags & SD_SHARE_LLC))
+			sd = sd->parent;
 
-				sd->imb_numa_nr = imb * factor;
-			}
-		}
+		/*
+		 * In presence of higher domains, adjust the
+		 * NUMA imbalance stats for the hierarchy.
+		 */
+		if (IS_ENABLED(CONFIG_NUMA) && (sd->flags & SD_SHARE_LLC) && sd->parent)
+			adjust_numa_imbalance(sd);
 	}
 
 	/* Calculate CPU capacity for physical packages and nodes */