From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, "Gautham R . Shenoy",
    Vincent Guittot
Cc: Tim Chen, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
    Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
    Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
    Vern Hao, Len Brown, Aubrey Li, Zhao Liu, Chen Yu, Chen Yu, Adam Li,
    Aaron Lu, Tim Chen, linux-kernel@vger.kernel.org
Subject: [PATCH v2 04/23] sched/cache: Make LLC id continuous
Date: Wed, 3 Dec 2025 15:07:23 -0800
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce an index mapping between CPUs and their LLCs.
This provides a continuous per-LLC index needed for cache-aware load
balancing in later patches.

The existing per_cpu llc_id usually holds the number of the first CPU
in the LLC domain, so its values are sparse and unsuitable as an array
index. Sizing an array by llc_id directly would waste memory.

With the new mapping, CPUs in the same LLC share a continuous id:

  per_cpu(llc_id, CPU=0...15)  = 0
  per_cpu(llc_id, CPU=16...31) = 1
  per_cpu(llc_id, CPU=32...47) = 2
  ...

Co-developed-by: Chen Yu
Signed-off-by: Chen Yu
Signed-off-by: Tim Chen
---

Notes:
    v1->v2:
        Convert the static LLC id to be allocated sequentially as LLCs are
        discovered, and replace the old sd_llc_id. (Peter Zijlstra)

 kernel/sched/fair.c     |  9 ++++++-
 kernel/sched/sched.h    |  1 +
 kernel/sched/topology.c | 60 +++++++++++++++++++++++++++++++++++++++--
 3 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 710ed9943d27..0a3918269906 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1210,10 +1210,17 @@ __read_mostly unsigned int llc_imb_pct = 20;
 
 static int llc_id(int cpu)
 {
+	int llc;
+
 	if (cpu < 0)
 		return -1;
 
-	return per_cpu(sd_llc_id, cpu);
+	llc = per_cpu(sd_llc_id, cpu);
+	/* avoid race with cpu hotplug */
+	if (unlikely(llc >= max_llcs))
+		return -1;
+
+	return llc;
 }
 
 void mm_init_sched(struct mm_struct *mm, struct mm_sched __percpu *_pcpu_sched)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index bf72c5bab506..728737641847 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2075,6 +2075,7 @@ DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
 
 extern struct static_key_false sched_asym_cpucapacity;
 extern struct static_key_false sched_cluster_active;
+extern int max_llcs;
 
 static __always_inline bool sched_asym_cpucap_active(void)
 {
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 444bdfdab731..f25d950ab015 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -17,6 +17,8 @@ void sched_domains_mutex_unlock(void)
 	mutex_unlock(&sched_domains_mutex);
 }
 
+int max_llcs;
+
 /* Protected by sched_domains_mutex: */
 static cpumask_var_t sched_domains_tmpmask;
 static cpumask_var_t sched_domains_tmpmask2;
@@ -668,6 +670,55 @@ DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
 DEFINE_STATIC_KEY_FALSE(sched_asym_cpucapacity);
 DEFINE_STATIC_KEY_FALSE(sched_cluster_active);
 
+/*
+ * Assign continuous llc id for the CPU, and return
+ * the assigned llc id.
+ */
+static int update_llc_id(struct sched_domain *sd,
+			 int cpu)
+{
+	int id = per_cpu(sd_llc_id, cpu), i;
+
+	if (id >= 0)
+		return id;
+
+	if (sd) {
+		/* Look for any assigned id and reuse it. */
+		for_each_cpu(i, sched_domain_span(sd)) {
+			id = per_cpu(sd_llc_id, i);
+
+			if (id >= 0) {
+				per_cpu(sd_llc_id, cpu) = id;
+				return id;
+			}
+		}
+	}
+
+	/*
+	 * When 1. there is no id assigned to this LLC domain,
+	 * or 2. the sd is NULL, we reach here.
+	 * Consider the following scenario,
+	 * CPU0~CPU95 are in the node0, CPU96~CPU191 are
+	 * in the node1. During bootup, maxcpus=96 is
+	 * appended.
+	 * case 1: When running cpu_attach_domain(CPU24)
+	 * during boot up, CPU24 is the first CPU in its
+	 * non-NULL LLC domain. However,
+	 * its corresponding llc id has not been assigned yet.
+	 *
+	 * case 2: After boot up, the CPU100 is brought up
+	 * via sysfs manually. As a result, CPU100 has only a
+	 * NUMA domain attached, because CPU100 is the only CPU
+	 * of a sched domain, all its bottom domains are degenerated.
+	 * The LLC domain pointer sd is NULL for CPU100.
+	 *
+	 * For both cases, we want to increase the number of LLCs.
+	 */
+	per_cpu(sd_llc_id, cpu) = max_llcs++;
+
+	return per_cpu(sd_llc_id, cpu);
+}
+
 static void update_top_cache_domain(int cpu)
 {
 	struct sched_domain_shared *sds = NULL;
@@ -677,14 +728,13 @@ static void update_top_cache_domain(int cpu)
 
 	sd = highest_flag_domain(cpu, SD_SHARE_LLC);
 	if (sd) {
-		id = cpumask_first(sched_domain_span(sd));
 		size = cpumask_weight(sched_domain_span(sd));
 		sds = sd->shared;
 	}
 
 	rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
 	per_cpu(sd_llc_size, cpu) = size;
-	per_cpu(sd_llc_id, cpu) = id;
+	id = update_llc_id(sd, cpu);
 	rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
 
 	sd = lowest_flag_domain(cpu, SD_CLUSTER);
@@ -2488,6 +2538,12 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	bool has_asym = false;
 	bool has_cluster = false;
 
+	/* first scan of LLCs */
+	if (!max_llcs) {
+		for_each_possible_cpu(i)
+			per_cpu(sd_llc_id, i) = -1;
+	}
+
 	if (WARN_ON(cpumask_empty(cpu_map)))
 		goto error;
 
-- 
2.32.0
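
For readers who want to try the allocation scheme outside the kernel, the
following is a minimal userspace sketch of the idea, not the patch itself.
It assumes a fixed NR_CPUS, replaces per_cpu(sd_llc_id, cpu) with a plain
array, and describes an LLC span as a [first, last] CPU range where
first < 0 stands in for a NULL LLC domain. The names NR_CPUS (as used
here), reset_llc_ids(), assign_llc_id() and lookup_llc_id() are
hypothetical, and locking and RCU are omitted entirely.

```c
#include <assert.h>

#define NR_CPUS 192		/* hypothetical system size for this sketch */

int sd_llc_id[NR_CPUS];		/* stands in for per_cpu(sd_llc_id, cpu) */
int max_llcs;			/* number of LLC ids handed out so far */

/* Analogue of the "first scan" in build_sched_domains(): nothing assigned. */
void reset_llc_ids(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		sd_llc_id[i] = -1;
	max_llcs = 0;
}

/*
 * Analogue of update_llc_id(). The LLC span is the CPU range
 * [first, last]; first < 0 models a NULL LLC domain pointer.
 */
int assign_llc_id(int cpu, int first, int last)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	/* This CPU already has an id: keep it. */
	if (sd_llc_id[cpu] >= 0)
		return sd_llc_id[cpu];

	/* Reuse the id of any sibling in the same LLC span. */
	if (first >= 0) {
		for (int i = first; i <= last && i < NR_CPUS; i++) {
			if (sd_llc_id[i] >= 0) {
				sd_llc_id[cpu] = sd_llc_id[i];
				return sd_llc_id[cpu];
			}
		}
	}

	/* First CPU seen in this LLC, or no LLC domain: take the next id. */
	sd_llc_id[cpu] = max_llcs++;
	return sd_llc_id[cpu];
}

/* Analogue of llc_id() in fair.c, including the range check. */
int lookup_llc_id(int cpu)
{
	int llc;

	if (cpu < 0 || cpu >= NR_CPUS)
		return -1;

	llc = sd_llc_id[cpu];
	if (llc >= max_llcs)
		return -1;

	return llc;
}
```

With this sketch, calling reset_llc_ids() and then assign_llc_id() for
CPUs 0 through 47 with 16-CPU spans gives the ids 0, 1 and 2 from the
commit message, and a later assign_llc_id(100, -1, -1) for a CPU without
an LLC domain takes the next id, 3.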