From nobody Sun Jun 14 19:13:01 2026
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 62FB33DA7FB;
	Wed, 20 May 2026 08:34:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=193.142.43.55
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779266081; cv=none;
 b=a3KMRqUCU25UQlerX4wYd0pmBXNkctlonze0KU8Ec36Fi4oQ+uF7EVcl4bZ/OPJWdLtu1Ab973TCH6tr6gFQlndB/PcFv/qOwjUEjJCFXzeDmn+K8EqA3Zz6ow6zdDptZMpFKIhLJAnrqy742PedGEc39YK8bnPORX/DRinmRks=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1779266081; c=relaxed/simple;
	bh=xyZ8Wu4i0i3z4jgiOnh50QASLcaQ0af35EK1IpWSnIg=;
	h=Date:From:To:Subject:Cc:In-Reply-To:References:MIME-Version:
	 Message-ID:Content-Type;
 b=fubSrWx3oH7J0K+OmkpaJuRx96GLGrSO4gYxCIH+WPZQXJFV2BCFQCBRYA7/ByQanqhVoX+jlzkxNZvsfy1XAe/LN7Hn+FTRzgRu67l/EcFi9Ougulgn8+H2kQdFtq8QQuPiZPCYLUgWoJvFYPJBGzCz3drPdWSLmudr5zDvrAU=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=linutronix.de;
 spf=pass smtp.mailfrom=linutronix.de;
 dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de
 header.b=m6LzXBgs;
 dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de
 header.b=xVg3d+OH; arc=none smtp.client-ip=193.142.43.55
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=none dis=none) header.from=linutronix.de
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=linutronix.de
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de
 header.b="m6LzXBgs";
	dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de
 header.b="xVg3d+OH"
Date: Wed, 20 May 2026 08:34:36 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de;
	s=2020; t=1779266078;
	h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=kMMnBrNcfV/17zNhir+XL2wvJaqQn0D6d7qqVzoG9P0=;
	b=m6LzXBgsRV8abN+iH3cYWSEo2kHzDeEPblw4NbAmNM1z47e+suYfPfWxCen9FliSWwQw/5
	JHWXYEbGEBk6dp812/POnJ08iEFiJGe426tKpsfAkXbD2JqCVhgEUU45YZlrDANOmx41JM
	n6qFaOrgwS6HH9Jc/DCwWMWSI/b+FaUeDWpxg71lSBECaHHVjPkC23fuFGu5t16nH+/vj/
	F2AToaep9vFlouk4imaJikKdMWlkLGF3onLcMuAUsW30p+QKnBsIdzuIdE4FVJ5IXoNzvW
	ygTfiw4EVd8aucapFyzMwH5kpIWnH/HHLhKZUOC/qRdVIEGdhzQfuQnkV/b/ZQ==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de;
	s=2020e; t=1779266078;
	h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date:
	 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
	 content-type:content-type:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=kMMnBrNcfV/17zNhir+XL2wvJaqQn0D6d7qqVzoG9P0=;
	b=xVg3d+OHFYwAwMqUKQCm9j3iYa0hitygkJwZ1hA4WrwNGC+g52OlXxsOze9md+I9+zCVQA
	RUiC0snAqoPgAbDg==
From: "tip-bot2 for Chen Yu" <tip-bot2@linutronix.de>
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] sched/cache: Calculate the LLC size and store it in
 sched_domain
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>,
 Chen Yu <yu.c.chen@intel.com>, Tim Chen <tim.c.chen@linux.intel.com>,
 Tingyin Duan <tingyin.duan@gmail.com>, x86@kernel.org,
 linux-kernel@vger.kernel.org
In-Reply-To: =?utf-8?q?=3C37afee09ff608034da0ce149e72d33b6f4698edf=2E1778703?=
 =?utf-8?q?694=2Egit=2Etim=2Ec=2Echen=40linux=2Eintel=2Ecom=3E?=
References: =?utf-8?q?=3C37afee09ff608034da0ce149e72d33b6f4698edf=2E17787036?=
 =?utf-8?q?94=2Egit=2Etim=2Ec=2Echen=40linux=2Eintel=2Ecom=3E?=
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Message-ID: <177926607676.711.11678057428536861304.tip-bot2@tip-bot2>
Robot-ID: <tip-bot2@linutronix.de>
Robot-Unsubscribe: 
 Contact <mailto:tglx@kernel.org> to get blacklisted from these emails
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     7030513a08776b2ca70fccd5dfddf7bb5c5c88ba
Gitweb:        https://git.kernel.org/tip/7030513a08776b2ca70fccd5dfddf7bb5c5c88ba
Author:        Chen Yu <yu.c.chen@intel.com>
AuthorDate:    Wed, 13 May 2026 13:39:15 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 18 May 2026 21:33:15 +02:00

sched/cache: Calculate the LLC size and store it in sched_domain

Cache aware scheduling needs to know the LLC size that a process
can use, so that memory-intensive tasks are not over-aggregated
on a single LLC.

Introduce a preparation patch that adds get_effective_llc_bytes()
to get the LLC size that a CPU can use. The function can be further
enhanced by subtracting the LLC cache ways reserved by resctrl
(CAT in Intel RDT, etc.).

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tingyin Duan <tingyin.duan@gmail.com>
Link: https://patch.msgid.link/37afee09ff608034da0ce149e72d33b6f4698edf.1778703694.git.tim.c.chen@linux.intel.com
---
 drivers/base/cacheinfo.c       | 23 ++++++++-
 include/linux/cacheinfo.h      |  1 +-
 include/linux/sched/topology.h |  7 ++-
 kernel/sched/topology.c        | 98 +++++++++++++++++++++++++++++++--
 4 files changed, 126 insertions(+), 3 deletions(-)
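
Note: the proportional scaling done by get_effective_llc_bytes() in the
diff below can be sketched in userspace as follows. This is only an
illustration under stated assumptions, not kernel code: the function name
effective_llc_bytes() and its plain integer parameters are hypothetical
stand-ins for ci->size, sd->span_weight and the weight of
ci->shared_cpu_map.

```c
#include <stdint.h>

/*
 * Hypothetical userspace model of the scaling in get_effective_llc_bytes().
 *
 * llc_size:    physical LLC size in bytes (ci->size in the patch)
 * span_weight: CPUs in the topmost SD_SHARE_LLC domain (sd->span_weight)
 * hw_weight:   CPUs sharing the LLC in hardware
 *              (weight of ci->shared_cpu_map)
 */
static inline uint64_t effective_llc_bytes(uint32_t llc_size,
					   uint32_t span_weight,
					   uint32_t hw_weight)
{
	/* Mirror the patch: report 0 when the sharing map is empty. */
	if (!hw_weight)
		return 0;

	/* Widen before multiplying so llc_size * span_weight cannot overflow. */
	return ((uint64_t)llc_size * span_weight) / hw_weight;
}
```

For a 32 MiB LLC shared by 16 CPUs in hardware and a partition spanning 4
of them, effective_llc_bytes(33554432, 4, 16) gives 8388608 (8 MiB). With
hw_weight == 1, which is what the first CPU of an LLC sees before its
siblings come online, the same span gives an inflated 134217728; the patch
handles this by re-evaluating every LLC sibling in sched_update_llc_bytes().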

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 391ac5e..70701d3 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -17,6 +17,7 @@
 #include <linux/init.h>
 #include <linux/of.h>
 #include <linux/sched.h>
+#include <linux/sched/topology.h>
 #include <linux/slab.h>
 #include <linux/smp.h>
 #include <linux/sysfs.h>
@@ -68,6 +69,24 @@ bool last_level_cache_is_valid(unsigned int cpu)
 
 }
 
+/*
+ * Get the cacheinfo of the LLC associated with @cpu.
+ * Derived from update_per_cpu_data_slice_size_cpu().
+ */
+struct cacheinfo *get_cpu_cacheinfo_llc(unsigned int cpu)
+{
+	struct cacheinfo *llc;
+
+	if (!last_level_cache_is_valid(cpu))
+		return NULL;
+
+	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
+	if (llc->type != CACHE_TYPE_DATA && llc->type != CACHE_TYPE_UNIFIED)
+		return NULL;
+
+	return llc;
+}
+
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
 {
 	struct cacheinfo *llc_x, *llc_y;
@@ -1018,6 +1037,7 @@ static int cacheinfo_cpu_online(unsigned int cpu)
 		goto err;
 	if (cpu_map_shared_cache(true, cpu, &cpu_map))
 		update_per_cpu_data_slice_size(true, cpu, cpu_map);
+	sched_update_llc_bytes(cpu);
 	return 0;
 err:
 	free_cache_attributes(cpu);
@@ -1036,6 +1056,9 @@ static int cacheinfo_cpu_pre_down(unsigned int cpu)
 	free_cache_attributes(cpu);
 	if (nr_shared > 1)
 		update_per_cpu_data_slice_size(false, cpu, cpu_map);
+
+	sched_update_llc_bytes(cpu);
+
 	return 0;
 }
 
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index c8f4f0a..fc879ac 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -89,6 +89,7 @@ int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
 bool last_level_cache_is_valid(unsigned int cpu);
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+struct cacheinfo *get_cpu_cacheinfo_llc(unsigned int cpu);
 int fetch_cache_info(unsigned int cpu);
 int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 0036d6b..fe09d32 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -106,6 +106,7 @@ struct sched_domain {
 #ifdef CONFIG_SCHED_CACHE
 	unsigned int llc_max;
 	unsigned int *llc_counts __counted_by_ptr(llc_max);
+	unsigned long llc_bytes;
 #endif
 
 #ifdef CONFIG_SCHEDSTATS
@@ -265,4 +266,10 @@ static inline int task_node(const struct task_struct *p)
 	return cpu_to_node(task_cpu(p));
 }
 
+#ifdef CONFIG_SCHED_CACHE
+extern void sched_update_llc_bytes(unsigned int cpu);
+#else
+static inline void sched_update_llc_bytes(unsigned int cpu) { }
+#endif
+
 #endif /* _LINUX_SCHED_TOPOLOGY_H */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 9fc9934..7248a72 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -776,9 +776,11 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 			/* move buffer to parent as child is being destroyed */
 			sd->llc_counts = tmp->llc_counts;
 			sd->llc_max = tmp->llc_max;
+			sd->llc_bytes = tmp->llc_bytes;
 			/* make sure destroy_sched_domain() does not free it */
 			tmp->llc_counts = NULL;
 			tmp->llc_max = 0;
+			tmp->llc_bytes = 0;
 #endif
 			/*
 			 * sched groups hold the flags of the child sched
@@ -831,10 +833,42 @@ DEFINE_STATIC_KEY_FALSE(sched_cache_active);
 /* user wants cache aware scheduling [0 or 1] */
 int sysctl_sched_cache_user = 1;
 
+/*
+ * Get the effective LLC size in bytes that @cpu's bottom sched_domain
+ * can use. A CPU within a cpuset partition can only use a proportion
+ * of the physical LLC, scaled by the ratio of the partition's span
+ * weight to the hardware LLC sharing weight. @sd should be the
+ * topmost domain with SD_SHARE_LLC.
+ *
+ * Returns 0 if cacheinfo is not yet populated. This happens during
+ * early boot when build_sched_domains() runs before the generic
+ * cacheinfo framework has been initialized (cacheinfo_cpu_online()
+ * is a device_initcall cpuhp callback). In that case,
+ * cacheinfo_cpu_online() will later call sched_update_llc_bytes()
+ * to fill in the bottom domain's llc_bytes once the cache attributes
+ * are available.
+ */
+static unsigned long get_effective_llc_bytes(int cpu,
+					     struct sched_domain *sd)
+{
+	struct cacheinfo *ci;
+	unsigned int hw_weight;
+
+	ci = get_cpu_cacheinfo_llc(cpu);
+	if (!ci)
+		return 0;
+
+	hw_weight = cpumask_weight(&ci->shared_cpu_map);
+	if (!hw_weight)
+		return 0;
+
+	return div_u64((u64)ci->size * sd->span_weight, hw_weight);
+}
+
 static bool alloc_sd_llc(const struct cpumask *cpu_map,
 			 struct s_data *d)
 {
-	struct sched_domain *sd;
+	struct sched_domain *sd, *top_llc, *parent;
 	unsigned int *p;
 	int i;
 
@@ -848,8 +882,24 @@ static bool alloc_sd_llc(const struct cpumask *cpu_map,
 		if (!p)
 			goto err;
 
-		sd->llc_max = max_lid + 1;
-		sd->llc_counts = p;
+		top_llc = sd;
+		/*
+		 * Find the topmost SD_SHARE_LLC domain.
+		 * Not yet attached to the CPU, so per_cpu(sd_llc, i)
+		 * can not be used.
+		 */
+		while ((parent = rcu_dereference_protected(top_llc->parent, true)) &&
+		       (parent->flags & SD_SHARE_LLC))
+			top_llc = parent;
+
+		if (top_llc->flags & SD_SHARE_LLC) {
+			sd->llc_max = max_lid + 1;
+			sd->llc_counts = p;
+			sd->llc_bytes = get_effective_llc_bytes(i, top_llc);
+		} else {
+			/* avoid memory leak */
+			kfree(p);
+		}
 	}
 
 	return true;
@@ -860,6 +910,7 @@ err:
 			kfree(sd->llc_counts);
 			sd->llc_counts = NULL;
 			sd->llc_max = 0;
+			sd->llc_bytes = 0;
 		}
 	}
 
@@ -919,6 +970,47 @@ void sched_cache_active_set_unlocked(void)
 {
 	return sched_cache_active_set(false);
 }
+
+/*
+ * Update the bottom sched_domain's llc_bytes for @cpu and all its
+ * LLC siblings. Called from cacheinfo_cpu_online() or
+ * cacheinfo_cpu_pre_down() with cpu hotplug lock held.
+ *
+ * Note: get_effective_llc_bytes() returns 0 on PowerPC, so
+ * cache aware scheduling is disabled on PowerPC for now.
+ * PowerPC does not use the generic cacheinfo framework --
+ * it has its own cacheinfo with a separate struct cache hierarchy
+ * and does not populate the per-CPU struct cpu_cacheinfo array
+ * that get_cpu_cacheinfo_llc() reads.
+ */
+void sched_update_llc_bytes(unsigned int cpu)
+{
+	struct sched_domain *sd, *sdp;
+	unsigned int i;
+
+	sched_domains_mutex_lock();
+
+	sdp = rcu_dereference_sched_domain(per_cpu(sd_llc, cpu));
+	if (!sdp)
+		goto unlock;
+
+	/*
+	 * ci->shared_cpu_map is built incrementally as CPUs come
+	 * online, so the first CPU in an LLC initially sees
+	 * hw_weight == 1 and computes an inflated llc_bytes in
+	 * get_effective_llc_bytes().  Re-evaluating every LLC
+	 * sibling on each online event corrects this once the full
+	 * shared_cpu_map is known.
+	 */
+	for_each_cpu(i, sched_domain_span(sdp)) {
+		sd = rcu_dereference_sched_domain(cpu_rq(i)->sd);
+		if (sd)
+			sd->llc_bytes = get_effective_llc_bytes(i, sdp);
+	}
+	}
+
+unlock:
+	sched_domains_mutex_unlock();
+}
 #else
 static bool alloc_sd_llc(const struct cpumask *cpu_map,
 			 struct s_data *d)