From nobody Mon May 25 00:09:08 2026
Date: Wed, 20 May 2026 08:34:36 -0000
From: "tip-bot2 for Chen Yu"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] sched/cache: Calculate the LLC size and store it in sched_domain
Cc: "Peter Zijlstra (Intel)", Chen Yu, Tim Chen, Tingyin Duan, x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <37afee09ff608034da0ce149e72d33b6f4698edf.1778703694.git.tim.c.chen@linux.intel.com>
References: <37afee09ff608034da0ce149e72d33b6f4698edf.1778703694.git.tim.c.chen@linux.intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Message-ID:
 <177926607676.711.11678057428536861304.tip-bot2@tip-bot2>
Content-Type: text/plain; charset="utf-8"

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     7030513a08776b2ca70fccd5dfddf7bb5c5c88ba
Gitweb:        https://git.kernel.org/tip/7030513a08776b2ca70fccd5dfddf7bb5c5c88ba
Author:        Chen Yu
AuthorDate:    Wed, 13 May 2026 13:39:15 -07:00
Committer:     Peter Zijlstra
CommitterDate: Mon, 18 May 2026 21:33:15 +02:00

sched/cache: Calculate the LLC size and store it in sched_domain

Cache aware scheduling needs to know the LLC size that a process can
use, so as to prevent memory-intensive tasks from being over-aggregated
on a single LLC.

Introduce a preparation patch to add get_effective_llc_bytes() to get
the LLC size that a CPU can use. The function can be further enhanced
by subtracting the LLC cache ways reserved by resctrl (CAT in Intel
RDT, etc).

Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Chen Yu
Co-developed-by: Tim Chen
Signed-off-by: Tim Chen
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Tingyin Duan
Link: https://patch.msgid.link/37afee09ff608034da0ce149e72d33b6f4698edf.1778703694.git.tim.c.chen@linux.intel.com
---
 drivers/base/cacheinfo.c       | 23 ++++++++-
 include/linux/cacheinfo.h      |  1 +-
 include/linux/sched/topology.h |  7 ++-
 kernel/sched/topology.c        | 98 +++++++++++++++++++++++++++++++--
 4 files changed, 126 insertions(+), 3 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 391ac5e..70701d3 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -68,6 +69,24 @@ bool last_level_cache_is_valid(unsigned int cpu)
 
 }
 
+/*
+ * Get the cacheinfo of the LLC associated with @cpu.
+ * Derived from update_per_cpu_data_slice_size_cpu().
+ */
+struct cacheinfo *get_cpu_cacheinfo_llc(unsigned int cpu)
+{
+	struct cacheinfo *llc;
+
+	if (!last_level_cache_is_valid(cpu))
+		return NULL;
+
+	llc = per_cpu_cacheinfo_idx(cpu, cache_leaves(cpu) - 1);
+	if (llc->type != CACHE_TYPE_DATA && llc->type != CACHE_TYPE_UNIFIED)
+		return NULL;
+
+	return llc;
+}
+
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y)
 {
 	struct cacheinfo *llc_x, *llc_y;
@@ -1018,6 +1037,7 @@ static int cacheinfo_cpu_online(unsigned int cpu)
 		goto err;
 	if (cpu_map_shared_cache(true, cpu, &cpu_map))
 		update_per_cpu_data_slice_size(true, cpu, cpu_map);
+	sched_update_llc_bytes(cpu);
 	return 0;
 err:
 	free_cache_attributes(cpu);
@@ -1036,6 +1056,9 @@ static int cacheinfo_cpu_pre_down(unsigned int cpu)
 	free_cache_attributes(cpu);
 	if (nr_shared > 1)
 		update_per_cpu_data_slice_size(false, cpu, cpu_map);
+
+	sched_update_llc_bytes(cpu);
+
 	return 0;
 }
 
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index c8f4f0a..fc879ac 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -89,6 +89,7 @@ int populate_cache_leaves(unsigned int cpu);
 int cache_setup_acpi(unsigned int cpu);
 bool last_level_cache_is_valid(unsigned int cpu);
 bool last_level_cache_is_shared(unsigned int cpu_x, unsigned int cpu_y);
+struct cacheinfo *get_cpu_cacheinfo_llc(unsigned int cpu);
 int fetch_cache_info(unsigned int cpu);
 int detect_cache_attributes(unsigned int cpu);
 #ifndef CONFIG_ACPI_PPTT
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 0036d6b..fe09d32 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -106,6 +106,7 @@ struct sched_domain {
 #ifdef CONFIG_SCHED_CACHE
 	unsigned int llc_max;
 	unsigned int *llc_counts __counted_by_ptr(llc_max);
+	unsigned long llc_bytes;
 #endif
 
 #ifdef CONFIG_SCHEDSTATS
@@ -265,4 +266,10 @@ static inline int task_node(const struct task_struct *p)
 	return cpu_to_node(task_cpu(p));
 }
 
+#ifdef CONFIG_SCHED_CACHE
+extern void sched_update_llc_bytes(unsigned int cpu);
+#else
+static inline void sched_update_llc_bytes(unsigned int cpu) { }
+#endif
+
 #endif /* _LINUX_SCHED_TOPOLOGY_H */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 9fc9934..7248a72 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -776,9 +776,11 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 			/* move buffer to parent as child is being destroyed */
 			sd->llc_counts = tmp->llc_counts;
 			sd->llc_max = tmp->llc_max;
+			sd->llc_bytes = tmp->llc_bytes;
 			/* make sure destroy_sched_domain() does not free it */
 			tmp->llc_counts = NULL;
 			tmp->llc_max = 0;
+			tmp->llc_bytes = 0;
 #endif
 			/*
 			 * sched groups hold the flags of the child sched
@@ -831,10 +833,42 @@ DEFINE_STATIC_KEY_FALSE(sched_cache_active);
 /* user wants cache aware scheduling [0 or 1] */
 int sysctl_sched_cache_user = 1;
 
+/*
+ * Get the effective LLC size in bytes that @cpu's bottom sched_domain
+ * can use. A CPU within a cpuset partition can only use a proportion
+ * of the physical LLC, scaled by the ratio of the partition's span
+ * weight to the hardware LLC sharing weight. @sd should be the
+ * topmost domain with SD_SHARE_LLC.
+ *
+ * Returns 0 if cacheinfo is not yet populated. This happens during
+ * early boot when build_sched_domains() runs before the generic
+ * cacheinfo framework has been initialized (cacheinfo_cpu_online()
+ * is a device_initcall cpuhp callback). In that case,
+ * cacheinfo_cpu_online() will later call sched_update_llc_bytes()
+ * to fill in the bottom domain's llc_bytes once the cache attributes
+ * are available.
+ */
+static unsigned long get_effective_llc_bytes(int cpu,
+					     struct sched_domain *sd)
+{
+	struct cacheinfo *ci;
+	unsigned int hw_weight;
+
+	ci = get_cpu_cacheinfo_llc(cpu);
+	if (!ci)
+		return 0;
+
+	hw_weight = cpumask_weight(&ci->shared_cpu_map);
+	if (!hw_weight)
+		return 0;
+
+	return div_u64((u64)ci->size * sd->span_weight, hw_weight);
+}
+
 static bool alloc_sd_llc(const struct cpumask *cpu_map,
 			 struct s_data *d)
 {
-	struct sched_domain *sd;
+	struct sched_domain *sd, *top_llc, *parent;
 	unsigned int *p;
 	int i;
 
@@ -848,8 +882,24 @@ static bool alloc_sd_llc(const struct cpumask *cpu_map,
 		if (!p)
 			goto err;
 
-		sd->llc_max = max_lid + 1;
-		sd->llc_counts = p;
+		top_llc = sd;
+		/*
+		 * Find the topmost SD_SHARE_LLC domain.
+		 * Not yet attached to the CPU, so per_cpu(sd_llc, i)
+		 * cannot be used.
+		 */
+		while ((parent = rcu_dereference_protected(top_llc->parent, true)) &&
+		       (parent->flags & SD_SHARE_LLC))
+			top_llc = parent;
+
+		if (top_llc->flags & SD_SHARE_LLC) {
+			sd->llc_max = max_lid + 1;
+			sd->llc_counts = p;
+			sd->llc_bytes = get_effective_llc_bytes(i, top_llc);
+		} else {
+			/* avoid memory leak */
+			kfree(p);
+		}
 	}
 
 	return true;
@@ -860,6 +910,7 @@ err:
 			kfree(sd->llc_counts);
 			sd->llc_counts = NULL;
 			sd->llc_max = 0;
+			sd->llc_bytes = 0;
 		}
 	}
 
@@ -919,6 +970,47 @@ void sched_cache_active_set_unlocked(void)
 {
 	return sched_cache_active_set(false);
 }
+
+/*
+ * Update the bottom sched_domain's llc_bytes for @cpu and all its
+ * LLC siblings. Called from cacheinfo_cpu_online() or
+ * cacheinfo_cpu_pre_down() with cpu hotplug lock held.
+ *
+ * Note: get_effective_llc_bytes() returns 0 on PowerPC,
+ * thus cache aware scheduling is disabled on PowerPC for
+ * now. PowerPC does not use the generic cacheinfo framework --
+ * it has its own cacheinfo with a separate struct cache hierarchy
+ * and does not populate the per-CPU struct cpu_cacheinfo array
+ * that get_cpu_cacheinfo_llc() reads.
+ */
+void sched_update_llc_bytes(unsigned int cpu)
+{
+	struct sched_domain *sd, *sdp;
+	unsigned int i;
+
+	sched_domains_mutex_lock();
+
+	sdp = rcu_dereference_sched_domain(per_cpu(sd_llc, cpu));
+	if (!sdp)
+		goto unlock;
+
+	/*
+	 * ci->shared_cpu_map is built incrementally as CPUs come
+	 * online, so the first CPU in an LLC initially sees
+	 * hw_weight == 1 and computes an inflated llc_bytes in
+	 * get_effective_llc_bytes(). Re-evaluating every LLC
+	 * sibling on each online event corrects this once the full
+	 * shared_cpu_map is known.
+	 */
+	for_each_cpu(i, sched_domain_span(sdp)) {
+		sd = rcu_dereference_sched_domain(cpu_rq(i)->sd);
+		if (sd)
+			sd->llc_bytes = get_effective_llc_bytes(i, sdp);
+	}
+
+unlock:
+	sched_domains_mutex_unlock();
+}
 #else
 static bool alloc_sd_llc(const struct cpumask *cpu_map,
 			 struct s_data *d)
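
Illustration (not part of the commit): the proportional scaling done by
get_effective_llc_bytes() can be tried out in userspace. The sketch below
is a minimal stand-in under stated assumptions: effective_llc_bytes() is a
hypothetical helper name, the three parameters replace ci->size,
sd->span_weight and cpumask_weight(&ci->shared_cpu_map) from the patch, and
plain 64-bit division replaces div_u64(). The cache size and CPU counts in
the note that follows are made-up numbers.

```c
#include <stdint.h>

/*
 * Userspace sketch of the scaling in get_effective_llc_bytes().
 * effective_llc_bytes() is a hypothetical name, not a kernel function.
 *
 * llc_size:    physical LLC size in bytes (ci->size in the patch)
 * span_weight: number of CPUs in the topmost SD_SHARE_LLC domain
 * hw_weight:   number of CPUs sharing the LLC in hardware
 */
uint64_t effective_llc_bytes(uint64_t llc_size,
			     unsigned int span_weight,
			     unsigned int hw_weight)
{
	/* As in the patch, an empty sharing mask yields 0. */
	if (!hw_weight)
		return 0;

	/* Scale the physical size by span_weight / hw_weight. */
	return llc_size * span_weight / hw_weight;
}
```

With an assumed 32 MiB LLC shared by 16 CPUs, a partition spanning 4 of
those CPUs gets effective_llc_bytes(32 MiB, 4, 16) = 8 MiB. Passing
hw_weight = 1 with span_weight = 16 gives 512 MiB, which is the kind of
inflated value the comment in sched_update_llc_bytes() describes for the
first CPU of an LLC to come online.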