From: Tim Chen
To: Peter Zijlstra
Subject: [Patch v3 1/6] sched/fair: Determine active load balance for SMT sched groups
Date: Fri, 7 Jul 2023 15:57:00 -0700

From: Tim C Chen

On hybrid CPUs with scheduling cluster enabled, we will need to
consider balancing between the SMT CPU clusters and the Atom core
clusters. Below is such a hybrid x86 CPU with 4 big cores and 8 atom
cores. Each scheduling cluster spans an L2 cache.

	--L2--    --L2--    --L2--    --L2--    ----L2----    -----L2------
	[0, 1]    [2, 3]    [4, 5]    [6, 7]    [8 9 10 11]   [12 13 14 15]
	Big       Big       Big       Big       Atom          Atom
	core      core      core      core      Module        Module

If the busiest group is a big core with both SMT CPUs busy, we should
do active load balance if the destination group has idle CPU cores.
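As a minimal illustration of that rule (a user-space sketch only, not
kernel code; the struct fields and the example topology are made up),
an SMT group whose siblings are all busy is a candidate for active
balance toward a group that still has a fully idle core:

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-group snapshot; field names are illustrative only. */
struct group_snapshot {
	bool smt;		/* group CPUs share capacity (SMT siblings) */
	int nr_running;		/* tasks currently running in the group */
	int idle_cores;		/* cores whose CPUs are all idle */
};

/*
 * The rule from the paragraph above: pull from a fully busy SMT group
 * when the destination group still has a whole idle core to offer.
 */
static bool want_active_balance(const struct group_snapshot *busiest,
				const struct group_snapshot *dst)
{
	return busiest->smt && busiest->nr_running > 1 && dst->idle_cores > 0;
}

int main(void)
{
	struct group_snapshot big  = { .smt = true,  .nr_running = 2, .idle_cores = 0 };
	struct group_snapshot atom = { .smt = false, .nr_running = 1, .idle_cores = 3 };

	printf("active balance big -> atom: %s\n",
	       want_active_balance(&big, &atom) ? "yes" : "no");	/* yes */
	return 0;
}

The kernel changes below add this consideration on top of the existing
group classification.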
Such a condition is considered by asym_active_balance() in load
balancing, but it is not considered when looking for the busiest group
and computing the load imbalance. Add this consideration in
find_busiest_group() and calculate_imbalance().

In addition, update the logic determining the busier group when one
group is SMT and the other group is non-SMT, but both groups are
partially busy with idle CPUs. The busier group should be the group
with idle cores rather than the group with one busy SMT CPU. We do not
want to make the SMT group the busiest one, pull the only task off the
SMT CPU, and cause the whole core to go empty.

Otherwise, suppose that in the search for the busiest group we first
encounter an SMT group with 1 task and set it as the busiest. The
destination group is an atom cluster with 1 task, and we next
encounter an atom cluster group with 3 tasks; we will not pick this
atom cluster over the SMT group, even though we should. As a result,
we do not load balance the busier atom cluster (with 3 tasks) towards
the local atom cluster (with 1 task). And it does not make sense to
pick the 1-task SMT group as the busier group, as we also should not
pull the task off the SMT core towards the 1-task atom cluster and
make the SMT core completely empty.

Signed-off-by: Tim Chen
---
 kernel/sched/fair.c | 80 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 77 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 87317634fab2..f636d6c09dc6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8279,6 +8279,11 @@ enum group_type {
 	 * more powerful CPU.
 	 */
 	group_misfit_task,
+	/*
+	 * Balance SMT group that's fully busy. Can benefit from migrating
+	 * a task on SMT with busy sibling to another CPU on idle core.
+	 */
+	group_smt_balance,
 	/*
 	 * SD_ASYM_PACKING only: One local CPU with higher capacity is available,
 	 * and the task should be migrated to it instead of running on the
@@ -8987,6 +8992,7 @@ struct sg_lb_stats {
 	unsigned int group_weight;
 	enum group_type group_type;
 	unsigned int group_asym_packing; /* Tasks should be moved to preferred CPU */
+	unsigned int group_smt_balance;  /* Task on busy SMT should be moved */
 	unsigned long group_misfit_task_load; /* A CPU has a task too big for its capacity */
 #ifdef CONFIG_NUMA_BALANCING
 	unsigned int nr_numa_running;
@@ -9260,6 +9266,9 @@ group_type group_classify(unsigned int imbalance_pct,
 	if (sgs->group_asym_packing)
 		return group_asym_packing;
 
+	if (sgs->group_smt_balance)
+		return group_smt_balance;
+
 	if (sgs->group_misfit_task_load)
 		return group_misfit_task;
 
@@ -9333,6 +9342,36 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs
 	return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
 }
 
+/* One group has more than one SMT CPU while the other group does not */
+static inline bool smt_vs_nonsmt_groups(struct sched_group *sg1,
+				    struct sched_group *sg2)
+{
+	if (!sg1 || !sg2)
+		return false;
+
+	return (sg1->flags & SD_SHARE_CPUCAPACITY) !=
+		(sg2->flags & SD_SHARE_CPUCAPACITY);
+}
+
+static inline bool smt_balance(struct lb_env *env, struct sg_lb_stats *sgs,
+			       struct sched_group *group)
+{
+	if (env->idle == CPU_NOT_IDLE)
+		return false;
+
+	/*
+	 * For SMT source group, it is better to move a task
+	 * to a CPU that doesn't have multiple tasks sharing its CPU capacity.
+	 * Note that if a group has a single SMT, SD_SHARE_CPUCAPACITY
+	 * will not be on.
+	 */
+	if (group->flags & SD_SHARE_CPUCAPACITY &&
+	    sgs->sum_h_nr_running > 1)
+		return true;
+
+	return false;
+}
+
 static inline bool
 sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
 {
@@ -9425,6 +9464,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		sgs->group_asym_packing = 1;
 	}
 
+	/* Check for loaded SMT group to be balanced to dst CPU */
+	if (!local_group && smt_balance(env, sgs, group))
+		sgs->group_smt_balance = 1;
+
 	sgs->group_type = group_classify(env->sd->imbalance_pct, group, sgs);
 
 	/* Computing avg_load makes sense only when group is overloaded */
@@ -9509,6 +9552,7 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 			return false;
 		break;
 
+	case group_smt_balance:
 	case group_fully_busy:
 		/*
 		 * Select the fully busy group with highest avg_load. In
@@ -9537,6 +9581,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 		break;
 
 	case group_has_spare:
+		/*
+		 * Do not pick sg with SMT CPUs over sg with pure CPUs,
+		 * as we do not want to pull task off SMT core with one task
+		 * and make the core idle.
+		 */
+		if (smt_vs_nonsmt_groups(sds->busiest, sg)) {
+			if (sg->flags & SD_SHARE_CPUCAPACITY && sgs->sum_h_nr_running <= 1)
+				return false;
+			else
+				return true;
+		}
+
 		/*
 		 * Select not overloaded group with lowest number of idle cpus
 		 * and highest number of running tasks. We could also compare
@@ -9733,6 +9789,7 @@ static bool update_pick_idlest(struct sched_group *idlest,
 
 	case group_imbalanced:
 	case group_asym_packing:
+	case group_smt_balance:
 		/* Those types are not used in the slow wakeup path */
 		return false;
 
@@ -9864,6 +9921,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
 
 	case group_imbalanced:
 	case group_asym_packing:
+	case group_smt_balance:
 		/* Those type are not used in the slow wakeup path */
 		return NULL;
 
@@ -10118,6 +10176,13 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 		return;
 	}
 
+	if (busiest->group_type == group_smt_balance) {
+		/* Reduce number of tasks sharing CPU capacity */
+		env->migration_type = migrate_task;
+		env->imbalance = 1;
+		return;
+	}
+
 	if (busiest->group_type == group_imbalanced) {
 		/*
 		 * In the group_imb case we cannot rely on group-wide averages
@@ -10363,16 +10428,23 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
 		goto force_balance;
 
 	if (busiest->group_type != group_overloaded) {
-		if (env->idle == CPU_NOT_IDLE)
+		if (env->idle == CPU_NOT_IDLE) {
 			/*
 			 * If the busiest group is not overloaded (and as a
 			 * result the local one too) but this CPU is already
 			 * busy, let another idle CPU try to pull task.
 			 */
 			goto out_balanced;
+		}
+
+		if (busiest->group_type == group_smt_balance &&
+		    smt_vs_nonsmt_groups(sds.local, sds.busiest)) {
+			/* Let non SMT CPU pull from SMT CPU sharing with sibling */
+			goto force_balance;
+		}
 
 		if (busiest->group_weight > 1 &&
-		    local->idle_cpus <= (busiest->idle_cpus + 1))
+		    local->idle_cpus <= (busiest->idle_cpus + 1)) {
			/*
			 * If the busiest group is not overloaded
			 * and there is no imbalance between this and busiest
@@ -10383,12 +10455,14 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
			 * there is more than 1 CPU per group.
			 */
			goto out_balanced;
+		}
 
-		if (busiest->sum_h_nr_running == 1)
+		if (busiest->sum_h_nr_running == 1) {
			/*
			 * busiest doesn't have any tasks waiting to run
			 */
			goto out_balanced;
+		}
 	}
 
 force_balance:
-- 
2.32.0

From: Tim Chen
To: Peter Zijlstra
Subject: [Patch v3 2/6] sched/topology: Record number of cores in sched group
Date: Fri, 7 Jul 2023 15:57:01 -0700

From: Tim C Chen

When balancing sibling domains that have different numbers of cores,
the tasks in each sibling domain should be proportional to the number
of cores in that domain. In preparation for implementing such a
policy, record the number of cores in a scheduling group.
Signed-off-by: Tim Chen
Reviewed-by: Valentin Schneider
---
 kernel/sched/sched.h    |  1 +
 kernel/sched/topology.c | 10 +++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3d0eb36350d2..5f7f36e45b87 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1860,6 +1860,7 @@ struct sched_group {
 	atomic_t		ref;
 
 	unsigned int		group_weight;
+	unsigned int		cores;
 	struct sched_group_capacity *sgc;
 	int			asym_prefer_cpu;	/* CPU of highest priority in group */
 	int			flags;
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6d5628fcebcf..6b099dbdfb39 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
 static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
 {
 	struct sched_group *sg = sd->groups;
+	struct cpumask *mask = sched_domains_tmpmask2;
 
 	WARN_ON(!sg);
 
 	do {
-		int cpu, max_cpu = -1;
+		int cpu, cores = 0, max_cpu = -1;
 
 		sg->group_weight = cpumask_weight(sched_group_span(sg));
 
+		cpumask_copy(mask, sched_group_span(sg));
+		for_each_cpu(cpu, mask) {
+			cores++;
+			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
+		}
+		sg->cores = cores;
+
 		if (!(sd->flags & SD_ASYM_PACKING))
			goto next;
 
-- 
2.32.0
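The core counting loop above can be pictured with plain bitmasks.
Below is a minimal user-space sketch of the same idea, with a
hypothetical smt_sibling_mask() standing in for cpu_smt_mask()
(illustration only, not kernel code):

#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical sibling lookup: CPUs 0-7 pair up as {0,1}{2,3}{4,5}{6,7},
 * CPUs 8-15 have no SMT sibling (mirrors the hybrid example earlier).
 */
static uint64_t smt_sibling_mask(int cpu)
{
	if (cpu < 8)
		return 3ULL << (cpu & ~1);	/* the core's two SMT CPUs */
	return 1ULL << cpu;			/* non-SMT core: just itself */
}

/*
 * Count cores in a group the same way the patch does: walk the span,
 * count one core per step, then strip that core's siblings from the mask.
 * __builtin_ctzll() is the GCC/Clang "lowest set bit" builtin.
 */
static int count_cores(uint64_t span)
{
	int cores = 0;

	while (span) {
		int cpu = __builtin_ctzll(span);

		cores++;
		span &= ~smt_sibling_mask(cpu);
	}
	return cores;
}

int main(void)
{
	printf("%d\n", count_cores(0xffULL));	/* CPUs 0-7  -> 4 cores */
	printf("%d\n", count_cores(0xff00ULL));	/* CPUs 8-15 -> 8 cores */
	return 0;
}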
From: Tim Chen
To: Peter Zijlstra
Subject: [Patch v3 3/6] sched/fair: Implement prefer sibling imbalance calculation between asymmetric groups
Date: Fri, 7 Jul 2023 15:57:02 -0700

From: Tim C Chen

In the current prefer-sibling load balancing code, there is an
implicit assumption that the busiest sched group and the local sched
group are equivalent, hence the number of tasks to be moved is simply
the difference in the number of tasks between the two groups (i.e. the
imbalance) divided by two.

However, we may have a different number of cores between the cluster
groups, say when we take CPUs offline or when we have hybrid groups.
In that case, we should balance between the two groups such that the
#tasks/#cores ratio is the same in both groups. Hence the imbalance
computed will need to reflect this.

Adjust the sibling imbalance computation to take the above
considerations into account.

Signed-off-by: Tim Chen
---
 kernel/sched/fair.c | 41 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 37 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f636d6c09dc6..f491b94908bf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9372,6 +9372,41 @@ static inline bool smt_balance(struct lb_env *env, struct sg_lb_stats *sgs,
 	return false;
 }
 
+static inline long sibling_imbalance(struct lb_env *env,
+				    struct sd_lb_stats *sds,
+				    struct sg_lb_stats *busiest,
+				    struct sg_lb_stats *local)
+{
+	int ncores_busiest, ncores_local;
+	long imbalance;
+
+	if (env->idle == CPU_NOT_IDLE || !busiest->sum_nr_running)
+		return 0;
+
+	ncores_busiest = sds->busiest->cores;
+	ncores_local = sds->local->cores;
+
+	if (ncores_busiest == ncores_local) {
+		imbalance = busiest->sum_nr_running;
+		lsub_positive(&imbalance, local->sum_nr_running);
+		return imbalance;
+	}
+
+	/* Balance such that nr_running/ncores ratio are same on both groups */
+	imbalance = ncores_local * busiest->sum_nr_running;
+	lsub_positive(&imbalance, ncores_busiest * local->sum_nr_running);
+	/* Normalize imbalance and do rounding on normalization */
+	imbalance = 2 * imbalance + ncores_local + ncores_busiest;
+	imbalance /= ncores_local + ncores_busiest;
+
+	/* Take advantage of resource in an empty sched group */
+	if (imbalance == 0 && local->sum_nr_running == 0 &&
+	    busiest->sum_nr_running > 1)
+		imbalance = 2;
+
+	return imbalance;
+}
+
 static inline bool
 sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
 {
@@ -10230,14 +10265,12 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 	}
 
 	if (busiest->group_weight == 1 || sds->prefer_sibling) {
-		unsigned int nr_diff = busiest->sum_nr_running;
		/*
		 * When prefer sibling, evenly spread running tasks on
		 * groups.
		 */
 		env->migration_type = migrate_task;
-		lsub_positive(&nr_diff, local->sum_nr_running);
-		env->imbalance = nr_diff;
+		env->imbalance = sibling_imbalance(env, sds, busiest, local);
 	} else {
 
 		/*
@@ -10424,7 +10457,7 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
 	 * group's child domain.
 	 */
 	if (sds.prefer_sibling && local->group_type == group_has_spare &&
-	    busiest->sum_nr_running > local->sum_nr_running + 1)
+	    sibling_imbalance(env, &sds, busiest, local) > 1)
 		goto force_balance;
 
 	if (busiest->group_type != group_overloaded) {
-- 
2.32.0
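For intuition, here is a minimal user-space sketch of the arithmetic
in sibling_imbalance() above, traced with made-up group sizes. The
kernel plumbing (lb_env, sd_lb_stats, lsub_positive(), the
CPU_NOT_IDLE early return) is replaced by plain arguments, and the
returned value plays the role the old nr_diff (busiest minus local
task count) played in calculate_imbalance():

#include <stdio.h>

/* Same arithmetic as sibling_imbalance(), with plain arguments. */
static long sketch_sibling_imbalance(int cores_busiest, long tasks_busiest,
				     int cores_local, long tasks_local)
{
	long imbalance;

	if (!tasks_busiest)
		return 0;

	if (cores_busiest == cores_local)
		return tasks_busiest > tasks_local ? tasks_busiest - tasks_local : 0;

	/* Balance so that the tasks/cores ratio matches on both sides. */
	imbalance = cores_local * tasks_busiest - cores_busiest * tasks_local;
	if (imbalance < 0)
		imbalance = 0;
	/* Normalize by total cores, rounding the way the patch does. */
	imbalance = (2 * imbalance + cores_local + cores_busiest) /
		    (cores_local + cores_busiest);

	/* Use an empty local group even when the ratios look balanced. */
	if (imbalance == 0 && tasks_local == 0 && tasks_busiest > 1)
		imbalance = 2;

	return imbalance;
}

int main(void)
{
	/* Two 4-core groups, 6 vs 2 tasks: plain difference, 4. */
	printf("%ld\n", sketch_sibling_imbalance(4, 6, 4, 2));
	/* 4-core atom module with 8 tasks vs 1-core big-core cluster with 1 task. */
	printf("%ld\n", sketch_sibling_imbalance(4, 8, 1, 1));
	/* Same groups with the big-core cluster completely idle. */
	printf("%ld\n", sketch_sibling_imbalance(4, 8, 1, 0));
	return 0;
}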
From: Tim Chen
To: Peter Zijlstra
Subject: [Patch v3 4/6] sched/fair: Consider the idle state of the whole core for load balance
Date: Fri, 7 Jul 2023 15:57:03 -0700

From: Ricardo Neri

should_we_balance() traverses the group_balance_mask (AND'ed with
lb_env::cpus) starting from lower-numbered CPUs looking for the first
idle CPU.

In hybrid x86 systems, the siblings of SMT cores get CPU numbers
before non-SMT cores:

	[0, 1]  [2, 3]  [4, 5]  6   7   8   9
	 b  i    b  i    b  i   b   i   i   i

In the figure above, CPUs in brackets are siblings of an SMT core. The
rest are non-SMT cores. 'b' indicates a busy CPU, 'i' indicates an
idle CPU.

We should let a CPU on a fully idle core get the first chance to do
idle load balancing, as it has more CPU capacity than an idle SMT CPU
whose sibling is busy. So for the figure above, if we are running
should_we_balance() on CPU 1, we should return false to let CPU 7,
which is on an idle core, have the first chance to do idle load
balancing.

A partially busy (i.e., of type group_has_spare) local group with SMT
cores will often have only one SMT sibling busy. If the destination
CPU is a non-SMT core, partially busy, lower-numbered SMT cores should
not be considered when finding the first idle CPU.

However, in should_we_balance(), when we encounter an idle SMT CPU
first in a partially busy core, we prematurely break the search for
the first idle CPU. Higher-numbered, non-SMT cores are not given the
chance to have idle balancing done on their behalf. Those CPUs will
only be considered for idle balancing by chance via CPU_NEWLY_IDLE.

Instead, consider the idle state of the whole SMT core.

Signed-off-by: Ricardo Neri
Co-developed-by: Tim Chen
Signed-off-by: Tim Chen
---
 kernel/sched/fair.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f491b94908bf..294a662c9410 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10729,7 +10729,7 @@ static int active_load_balance_cpu_stop(void *data);
 static int should_we_balance(struct lb_env *env)
 {
 	struct sched_group *sg = env->sd->groups;
-	int cpu;
+	int cpu, idle_smt = -1;
 
 	/*
 	 * Ensure the balancing environment is consistent; can happen
@@ -10756,10 +10756,24 @@ static int should_we_balance(struct lb_env *env)
 		if (!idle_cpu(cpu))
 			continue;
 
+		/*
+		 * Don't balance to idle SMT in busy core right away when
+		 * balancing cores, but remember the first idle SMT CPU for
+		 * later consideration. Find CPU on an idle core first.
+		 */
+		if (!(env->sd->flags & SD_SHARE_CPUCAPACITY) && !is_core_idle(cpu)) {
+			if (idle_smt == -1)
+				idle_smt = cpu;
+			continue;
+		}
+
 		/* Are we the first idle CPU? */
 		return cpu == env->dst_cpu;
 	}
 
+	if (idle_smt == env->dst_cpu)
+		return true;
+
 	/* Are we the first CPU of this group ? */
 	return group_balance_cpu(sg) == env->dst_cpu;
 }
-- 
2.32.0

From: Tim Chen
To: Peter Zijlstra
Subject: [Patch v3 5/6] sched/x86: Add cluster topology to hybrid CPU
Date: Fri, 7 Jul 2023 15:57:04 -0700

From: Tim C Chen

Cluster topology was not enabled on hybrid x86 CPUs because load
balancing did not work properly for the cluster domain. That has been
fixed, and the cluster domain can now be enabled for hybrid CPUs.
Reviewed-by: Ricardo Neri
Signed-off-by: Tim Chen
---
 arch/x86/kernel/smpboot.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index cea297d97034..2489d767c398 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -575,6 +575,9 @@ static struct sched_domain_topology_level x86_hybrid_topology[] = {
 #ifdef CONFIG_SCHED_SMT
 	{ cpu_smt_mask, x86_smt_flags, SD_INIT_NAME(SMT) },
 #endif
+#ifdef CONFIG_SCHED_CLUSTER
+	{ cpu_clustergroup_mask, x86_cluster_flags, SD_INIT_NAME(CLS) },
+#endif
 #ifdef CONFIG_SCHED_MC
 	{ cpu_coregroup_mask, x86_core_flags, SD_INIT_NAME(MC) },
 #endif
-- 
2.32.0
From: Tim Chen
To: Peter Zijlstra
Subject: [Patch v3 6/6] sched/debug: Dump domains' sched group flags
Date: Fri, 7 Jul 2023 15:57:05 -0700

From: "Peter Zijlstra (Intel)"

There has been a case where the SD_SHARE_CPUCAPACITY sched group flag
in a parent domain was not set and propagated properly when a
degenerate domain was removed.

Add a dump of a CPU's domain sched group flags to make future
debugging easier.

Usage:
cat /debug/sched/domains/cpu0/domain1/groups_flags
to dump cpu0 domain1's sched group flags.

Signed-off-by: Tim Chen
---
 kernel/sched/debug.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 1637b65ba07a..55b50f940feb 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -389,6 +389,7 @@ static void register_sd(struct sched_domain *sd, struct dentry *parent)
 #undef SDM
 
 	debugfs_create_file("flags", 0444, parent, &sd->flags, &sd_flags_fops);
+	debugfs_create_file("groups_flags", 0444, parent, &sd->groups->flags, &sd_flags_fops);
 }
 
 void update_sched_domain_debugfs(void)
-- 
2.32.0