From: "tip-bot2 for Peter Zijlstra"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Cc: Tejun Heo, "Peter Zijlstra (Intel)", x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [tip: sched/core] sched/fair: Multi-LLC select_idle_sibling()
Date: Wed, 31 May 2023 12:04:47 -0000
Message-ID: <168553468754.404.2298362895524875073.tip-bot2@tip-bot2>

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     c7dfd6b9122d29d0e9a4587ab470c0564d7f92ab
Gitweb:        https://git.kernel.org/tip/c7dfd6b9122d29d0e9a4587ab470c0564d7f92ab
Author:        Peter Zijlstra
AuthorDate:    Tue, 30 May 2023 13:20:46 +02:00
Committer:     Peter Zijlstra
CommitterDate: Tue, 30 May 2023 22:46:27 +02:00

sched/fair: Multi-LLC select_idle_sibling()

Tejun reported that when he targets workqueues towards a specific LLC
on his Zen2 machine with 3 cores / LLC and 4 LLCs in total, he gets
significant idle time.

This is, of course, because of how select_idle_sibling() will not
consider anything outside of the local LLC, and since all these tasks
are short running the periodic idle load balancer is ineffective.

And while it is good to keep work cache local, it is better to not
have significant idle time. Therefore, have select_idle_sibling() try
other LLCs inside the same node when the local one comes up empty.

Reported-by: Tejun Heo
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Chen Yu
---
 kernel/sched/fair.c     | 38 ++++++++++++++++++++++++++++++++++++++
 kernel/sched/features.h |  1 +
 2 files changed, 39 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 48b6f0c..0172458 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7028,6 +7028,38 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool 
 }
 
 /*
+ * For the multiple-LLC per node case, make sure to try the other LLC's if the
+ * local LLC comes up empty.
+ */
+static int
+select_idle_node(struct task_struct *p, struct sched_domain *sd, int target)
+{
+	struct sched_domain *parent = sd->parent;
+	struct sched_group *sg;
+
+	/* Make sure to not cross nodes. */
+	if (!parent || parent->flags & SD_NUMA)
+		return -1;
+
+	sg = parent->groups;
+	do {
+		int cpu = cpumask_first(sched_group_span(sg));
+		struct sched_domain *sd_child;
+
+		sd_child = per_cpu(sd_llc, cpu);
+		if (sd_child != sd) {
+			int i = select_idle_cpu(p, sd_child, test_idle_cores(cpu), cpu);
+			if ((unsigned)i < nr_cpumask_bits)
+				return i;
+		}
+
+		sg = sg->next;
+	} while (sg != parent->groups);
+
+	return -1;
+}
+
+/*
  * Scan the asym_capacity domain for idle CPUs; pick the first idle one on which
  * the task fits. If no CPU is big enough, but there are idle ones, try to
  * maximize capacity.
@@ -7199,6 +7231,12 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
 	if ((unsigned)i < nr_cpumask_bits)
 		return i;
 
+	if (sched_feat(SIS_NODE)) {
+		i = select_idle_node(p, sd, target);
+		if ((unsigned)i < nr_cpumask_bits)
+			return i;
+	}
+
 	return target;
 }
 
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index ee7f23c..9e390eb 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -62,6 +62,7 @@ SCHED_FEAT(TTWU_QUEUE, true)
  */
 SCHED_FEAT(SIS_PROP, false)
 SCHED_FEAT(SIS_UTIL, true)
+SCHED_FEAT(SIS_NODE, true)
 
 /*
  * Issue a WARN when we do multiple update_rq_clock() calls
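
For readers who want to trace the control flow outside the kernel, here is a minimal user-space sketch of the fallback order the patch introduces: scan the local LLC, then the other LLCs on the same node, then fall back to the target. It is not kernel code. The names llc_group, scan_llc(), scan_other_llcs(), pick_cpu(), init_ring() and sis_node_enabled are hypothetical stand-ins, and the topology (one node, 4 LLCs of 3 CPUs) is assumed from the commit message. The sketch leaves out the SD_NUMA check and scans each LLC from its first CPU, which is simpler than what select_idle_cpu() really does.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LLCS      4   /* assumed toy topology: one node, 4 LLCs */
#define CPUS_PER_LLC 3   /* 3 CPUs per LLC */
#define NR_CPUS      (NR_LLCS * CPUS_PER_LLC)

/* Hypothetical stand-in for a sched_group: one LLC on a circular list. */
struct llc_group {
	int first_cpu;
	struct llc_group *next;
};

static bool cpu_idle[NR_CPUS];
static bool sis_node_enabled = true;	/* models sched_feat(SIS_NODE) */

/* Link the groups into a ring, like the sg->next list of a sched_domain. */
static void init_ring(struct llc_group groups[NR_LLCS])
{
	for (int n = 0; n < NR_LLCS; n++) {
		groups[n].first_cpu = n * CPUS_PER_LLC;
		groups[n].next = &groups[(n + 1) % NR_LLCS];
	}
}

/* Scan one LLC; return the first idle CPU or -1 (toy select_idle_cpu()). */
static int scan_llc(const struct llc_group *g)
{
	for (int cpu = g->first_cpu; cpu < g->first_cpu + CPUS_PER_LLC; cpu++)
		if (cpu_idle[cpu])
			return cpu;
	return -1;
}

/* Walk the ring exactly once, skipping the local LLC (toy select_idle_node()). */
static int scan_other_llcs(struct llc_group *groups, const struct llc_group *local)
{
	struct llc_group *g = groups;

	do {
		if (g != local) {
			int i = scan_llc(g);
			/* -1 converts to a huge unsigned value, so one compare rejects it. */
			if ((unsigned)i < (unsigned)NR_CPUS)
				return i;
		}
		g = g->next;
	} while (g != groups);

	return -1;
}

/* Local LLC first, then sibling LLCs if enabled, otherwise keep the target. */
static int pick_cpu(struct llc_group groups[NR_LLCS], int target)
{
	struct llc_group *local;
	int i;

	assert(target >= 0 && target < NR_CPUS);
	local = &groups[target / CPUS_PER_LLC];

	i = scan_llc(local);
	if ((unsigned)i < (unsigned)NR_CPUS)
		return i;

	if (sis_node_enabled) {
		i = scan_other_llcs(groups, local);
		if ((unsigned)i < (unsigned)NR_CPUS)
			return i;
	}

	return target;
}
```

To try it, declare an array of NR_LLCS groups, call init_ring() on it, mark some entries of cpu_idle[] as true, and call pick_cpu() with a target CPU. Clearing sis_node_enabled models turning the SIS_NODE feature off, in which case a busy local LLC yields the target unchanged.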