From nobody Sun Apr 12 21:00:56 2026
From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, Gautham R. Shenoy,
	Vincent Guittot
Cc: Tim Chen, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
	Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
	Len Brown, Aubrey Li, Zhao Liu, Chen Yu, Adam Li, Aaron Lu,
	Josh Don, Gavin Guo, Qais Yousef, Libo Chen,
	linux-kernel@vger.kernel.org
Subject: [Patch v4 05/22] sched/cache: Make LLC id continuous
Date: Wed, 1 Apr 2026 14:52:17 -0700
Message-Id: <047ef46339e4db497b54a89940a7ebedf27fcf28.1775065312.git.tim.c.chen@linux.intel.com>

Introduce an index mapping between CPUs and their LLCs. This provides
the roughly continuous per-LLC index needed by the cache-aware load
balancing introduced in later patches.

The existing per-CPU llc_id usually holds the number of the first CPU
in the LLC domain, which is sparse and therefore unsuitable as an
array index: using llc_id directly would waste memory. With the new
mapping, all CPUs in the same LLC share one approximately continuous
id:

  per_cpu(llc_id, CPU=0...15)  = 0
  per_cpu(llc_id, CPU=16...31) = 1
  per_cpu(llc_id, CPU=32...47) = 2
  ...

Note that the LLC ids are allocated from a bitmask, so an id may be
reused across CPU offline->online transitions.

Suggested-by: Peter Zijlstra (Intel)
Originally-by: K Prateek Nayak
Co-developed-by: Chen Yu
Signed-off-by: Chen Yu
Signed-off-by: Tim Chen
---

Notes:
    v3->v4: Leverage dynamic cpumask management infrastructure
            for LLC id allocation.
            (K Prateek Nayak, Peter Zijlstra)

 kernel/sched/core.c     |  2 +
 kernel/sched/sched.h    |  3 ++
 kernel/sched/topology.c | 90 ++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index eff8695000e7..1188b5d24933 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8417,6 +8417,8 @@ int sched_cpu_deactivate(unsigned int cpu)
 	 */
 	synchronize_rcu();
 
+	sched_domains_free_llc_id(cpu);
+
 	sched_set_rq_offline(rq, cpu);
 
 	scx_rq_deactivate(rq);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 0a38bfc704a4..9defeeeb3e8e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -4019,6 +4019,9 @@ static inline bool sched_cache_enabled(void)
 	return false;
 }
 #endif
+
+void sched_domains_free_llc_id(int cpu);
+
 extern void init_sched_mm(struct task_struct *p);
 
 extern u64 avg_vruntime(struct cfs_rq *cfs_rq);
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 32dcddaead82..edf6d7ec73ca 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -18,8 +18,10 @@ void sched_domains_mutex_unlock(void)
 }
 
 /* Protected by sched_domains_mutex: */
+static cpumask_var_t sched_domains_llc_id_allocmask;
 static cpumask_var_t sched_domains_tmpmask;
 static cpumask_var_t sched_domains_tmpmask2;
+int max_lid;
 
 static int __init sched_debug_setup(char *str)
 {
@@ -663,7 +665,7 @@ static void destroy_sched_domains(struct sched_domain *sd)
  */
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
 DEFINE_PER_CPU(int, sd_llc_size);
-DEFINE_PER_CPU(int, sd_llc_id);
+DEFINE_PER_CPU(int, sd_llc_id) = -1;
 DEFINE_PER_CPU(int, sd_share_id);
 DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
@@ -689,7 +691,6 @@ static void update_top_cache_domain(int cpu)
 
 	rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
 	per_cpu(sd_llc_size, cpu) = size;
-	per_cpu(sd_llc_id, cpu) = id;
 	rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
 
 	sd = lowest_flag_domain(cpu, SD_CLUSTER);
@@ -1776,6 +1777,11 @@ const struct cpumask *tl_mc_mask(struct sched_domain_topology_level *tl, int cpu
 {
 	return cpu_coregroup_mask(cpu);
 }
+
+#define llc_mask(cpu)	cpu_coregroup_mask(cpu)
+
+#else
+#define llc_mask(cpu)	cpumask_of(cpu)
 #endif
 
 const struct cpumask *tl_pkg_mask(struct sched_domain_topology_level *tl, int cpu)
@@ -2548,6 +2554,61 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
 	return true;
 }
 
+static int __sched_domains_alloc_llc_id(void)
+{
+	int lid, max;
+
+	lockdep_assert_held(&sched_domains_mutex);
+
+	lid = cpumask_first_zero(sched_domains_llc_id_allocmask);
+	/*
+	 * llc_id space should never grow larger than the
+	 * possible number of CPUs in the system.
+	 */
+	if (lid >= nr_cpu_ids)
+		return -1;
+
+	__cpumask_set_cpu(lid, sched_domains_llc_id_allocmask);
+	max = cpumask_last(sched_domains_llc_id_allocmask);
+	if (max > max_lid)
+		max_lid = max;
+
+	return lid;
+}
+
+static void __sched_domains_free_llc_id(int cpu)
+{
+	int i, lid, max;
+
+	lockdep_assert_held(&sched_domains_mutex);
+
+	lid = per_cpu(sd_llc_id, cpu);
+	if (lid == -1 || lid >= nr_cpu_ids)
+		return;
+
+	per_cpu(sd_llc_id, cpu) = -1;
+
+	for_each_cpu(i, llc_mask(cpu)) {
+		/* An online CPU owns the llc_id. */
+		if (per_cpu(sd_llc_id, i) == lid)
+			return;
+	}
+
+	__cpumask_clear_cpu(lid, sched_domains_llc_id_allocmask);
+
+	max = cpumask_last(sched_domains_llc_id_allocmask);
+	/* shrink max lid to save memory */
+	if (max < max_lid)
+		max_lid = max;
+}
+
+void sched_domains_free_llc_id(int cpu)
+{
+	sched_domains_mutex_lock();
+	__sched_domains_free_llc_id(cpu);
+	sched_domains_mutex_unlock();
+}
+
 /*
  * Build sched domains for a given set of CPUs and attach the sched domains
  * to the individual CPUs
@@ -2573,6 +2634,7 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	/* Set up domains for CPUs specified by the cpu_map: */
 	for_each_cpu(i, cpu_map) {
 		struct sched_domain_topology_level *tl;
+		int lid;
 
 		sd = NULL;
 		for_each_sd_topology(tl) {
@@ -2586,6 +2648,29 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 			if (cpumask_equal(cpu_map, sched_domain_span(sd)))
 				break;
 		}
+
+		lid = per_cpu(sd_llc_id, i);
+		if (lid == -1) {
+			/* try to reuse the llc_id of its siblings */
+			for (int j = cpumask_first(llc_mask(i));
+			     j < nr_cpu_ids;
+			     j = cpumask_next(j, llc_mask(i))) {
+				if (i == j)
+					continue;
+
+				lid = per_cpu(sd_llc_id, j);
+
+				if (lid != -1) {
+					per_cpu(sd_llc_id, i) = lid;
+
+					break;
+				}
+			}
+
+			/* a new LLC is detected */
+			if (lid == -1)
+				per_cpu(sd_llc_id, i) = __sched_domains_alloc_llc_id();
+		}
 	}
 
 	if (WARN_ON(!topology_span_sane(cpu_map)))
@@ -2762,6 +2847,7 @@ int __init sched_init_domains(const struct cpumask *cpu_map)
 {
 	int err;
 
+	zalloc_cpumask_var(&sched_domains_llc_id_allocmask, GFP_KERNEL);
 	zalloc_cpumask_var(&sched_domains_tmpmask, GFP_KERNEL);
 	zalloc_cpumask_var(&sched_domains_tmpmask2, GFP_KERNEL);
 	zalloc_cpumask_var(&fallback_doms, GFP_KERNEL);
-- 
2.32.0