From nobody Thu Apr 2 17:10:29 2026
From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, "Gautham R. Shenoy",
	Vincent Guittot
Cc: Chen Yu, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
	Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
	Len Brown, Aubrey Li, Zhao Liu, Adam Li, Aaron Lu, Josh Don,
	Gavin Guo, Qais Yousef, Libo Chen, linux-kernel@vger.kernel.org
Subject: [PATCH v3 02/21] sched/cache: Record per LLC utilization to guide cache aware scheduling decisions
Date: Tue, 10 Feb 2026 14:18:42 -0800
Message-Id: <93f0a3958e2398e8b4a05c15cb89f0fd759c5ac9.1770760558.git.tim.c.chen@linux.intel.com>
X-Mailer: git-send-email 2.32.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Chen Yu

When a system becomes busy and a process's preferred LLC is saturated
with too many threads, tasks within that LLC migrate frequently. These
in-LLC migrations introduce latency and degrade performance. To avoid
this, task aggregation should be suppressed when the preferred LLC is
overloaded, which requires a metric that indicates LLC utilization.

Record per-LLC utilization and CPU capacity during periodic load
balancing. These statistics will be used in later patches to decide
whether tasks should be aggregated into their preferred LLC.

Co-developed-by: Tim Chen
Signed-off-by: Tim Chen
Signed-off-by: Chen Yu
---
Notes:
    v2->v3: Remove ____cacheline_aligned_in_smp attribute in
        struct sched_domain_shared to avoid premature optimization.
        (Peter Zijlstra)

 include/linux/sched/topology.h |  4 ++
 kernel/sched/fair.c            | 70 ++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 45c0022b91ce..a4e2fb31f2fd 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -68,6 +68,10 @@ struct sched_domain_shared {
 	atomic_t	nr_busy_cpus;
 	int		has_idle_cores;
 	int		nr_idle_scan;
+#ifdef CONFIG_SCHED_CACHE
+	unsigned long	util_avg;
+	unsigned long	capacity;
+#endif
 };
 
 struct sched_domain {
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 58286275e166..dfeb107f2cfd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9688,6 +9688,29 @@ static inline int task_is_ineligible_on_dst_cpu(struct task_struct *p, int dest_
 	return 0;
 }
 
+#ifdef CONFIG_SCHED_CACHE
+/* Called from load balancing paths with rcu_read_lock held */
+static __maybe_unused bool get_llc_stats(int cpu, unsigned long *util,
+					 unsigned long *cap)
+{
+	struct sched_domain_shared *sd_share;
+
+	sd_share = rcu_dereference(per_cpu(sd_llc_shared, cpu));
+	if (!sd_share)
+		return false;
+
+	*util = READ_ONCE(sd_share->util_avg);
+	*cap = READ_ONCE(sd_share->capacity);
+
+	return true;
+}
+#else
+static inline bool get_llc_stats(int cpu, unsigned long *util,
+				 unsigned long *cap)
+{
+	return false;
+}
+#endif
 /*
  * can_migrate_task - may task p from runqueue rq be migrated to this_cpu?
  */
@@ -10658,6 +10681,52 @@ sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
 	return check_cpu_capacity(rq, sd);
 }
 
+#ifdef CONFIG_SCHED_CACHE
+/*
+ * Record the statistics for this scheduler group for later
+ * use. These values guide load balancing on aggregating tasks
+ * to an LLC.
+ */
+static void record_sg_llc_stats(struct lb_env *env,
+				struct sg_lb_stats *sgs,
+				struct sched_group *group)
+{
+	struct sched_domain_shared *sd_share;
+
+	if (!sched_cache_enabled() || env->idle == CPU_NEWLY_IDLE)
+		return;
+
+	/* Only care about sched domains spanning multiple LLCs */
+	if (env->sd->child != rcu_dereference(per_cpu(sd_llc, env->dst_cpu)))
+		return;
+
+	/*
+	 * At this point we know this group spans an LLC domain.
+	 * Record the statistics of this group in its corresponding
+	 * shared LLC domain.
+	 * Note: sd_share cannot be obtained via sd->child->shared,
+	 * because the latter refers to the domain that covers the
+	 * local group. Instead, sd_share should be located using
+	 * the first CPU of the LLC group.
+	 */
+	sd_share = rcu_dereference(per_cpu(sd_llc_shared,
+				   cpumask_first(sched_group_span(group))));
+	if (!sd_share)
+		return;
+
+	if (READ_ONCE(sd_share->util_avg) != sgs->group_util)
+		WRITE_ONCE(sd_share->util_avg, sgs->group_util);
+
+	if (unlikely(READ_ONCE(sd_share->capacity) != sgs->group_capacity))
+		WRITE_ONCE(sd_share->capacity, sgs->group_capacity);
+}
+#else
+static inline void record_sg_llc_stats(struct lb_env *env, struct sg_lb_stats *sgs,
+				       struct sched_group *group)
+{
+}
+#endif
+
 /**
  * update_sg_lb_stats - Update sched_group's statistics for load balancing.
  * @env: The load balancing environment.
@@ -10747,6 +10816,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
 	sgs->group_type = group_classify(env->sd->imbalance_pct, group, sgs);
 
+	record_sg_llc_stats(env, sgs, group);
 	/* Computing avg_load makes sense only when group is overloaded */
 	if (sgs->group_type == group_overloaded)
 		sgs->avg_load = (sgs->group_load * SCHED_CAPACITY_SCALE) /
-- 
2.32.0