From nobody Sun Apr 12 21:02:22 2026
From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, "Gautham R. Shenoy",
	Vincent Guittot
Cc: Tim Chen, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
	Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
	Len Brown, Aubrey Li, Zhao Liu, Chen Yu, Adam Li, Aaron Lu,
	Josh Don, Gavin Guo, Qais Yousef, Libo Chen,
	linux-kernel@vger.kernel.org
Subject: [Patch v4 09/22] sched/cache: Calculate the percpu sd task LLC preference
Date: Wed, 1 Apr 2026 14:52:21 -0700
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Calculate the number of tasks' LLC preferences for each runqueue.
This statistic is computed during task enqueue and dequeue operations,
and is used by cache-aware load balancing.

Co-developed-by: Chen Yu
Signed-off-by: Chen Yu
Signed-off-by: Tim Chen
---
Notes:
    v3->v4: Remove unnecessary rcu_read_lock() in enqueue/dequeue as the
            rq lock is held. Use rcu_dereference_all() directly.
            (Peter Zijlstra)

 kernel/sched/fair.c | 49 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4b760bd604e7..e6474e61f4aa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1291,8 +1291,34 @@ static int llc_id(int cpu)
 	return per_cpu(sd_llc_id, cpu);
 }
 
+static inline bool valid_llc_buf(struct sched_domain *sd,
+				 int id)
+{
+	/*
+	 * These checks are used to avoid the following
+	 * race causing out-of-range access to llc_counts:
+	 *
+	 *   CPU0                             CPU1
+	 *                                      :
+	 *                                     ...
+	 *   build_sched_domains              update_sg_lb_stats
+	 *                                      for_each_cpu_and(i, sg)
+	 *                                        sd = rq[i]->sd
+	 *   per_cpu(sd_llc_id, i) = new_llc
+	 *                                        llc = llc_id(i)
+	 *                                        !!!sd->llc_counts[llc]!!!
+	 *   sd->llc_counts = kzalloc()
+	 *   sd->llc_max = max_llc
+	 */
+	if (unlikely(id < 0 || !sd || !sd->llc_counts || id > sd->llc_max))
+		return false;
+
+	return true;
+}
+
 static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;
 
 	pref_llc = p->preferred_llc;
@@ -1301,10 +1327,15 @@ static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
 
 	rq->nr_llc_running++;
 	rq->nr_pref_llc_running += (pref_llc == task_llc(p));
+
+	sd = rcu_dereference_all(rq->sd);
+	if (valid_llc_buf(sd, pref_llc))
+		sd->llc_counts[pref_llc]++;
 }
 
 static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;
 
 	pref_llc = p->preferred_llc;
@@ -1313,6 +1344,24 @@ static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
 
 	rq->nr_llc_running--;
 	rq->nr_pref_llc_running -= (pref_llc == task_llc(p));
+
+	sd = rcu_dereference_all(rq->sd);
+	if (valid_llc_buf(sd, pref_llc)) {
+		/*
+		 * There is a race condition between dequeue
+		 * and CPU hotplug. After a task has been enqueued
+		 * on CPUx, a CPU hotplug event occurs, and all online
+		 * CPUs (including CPUx) rebuild their sched_domains
+		 * and reset statistics to zero (including sd->llc_counts).
+		 * This can cause a temporary undercount, so we have to
+		 * check for such underflow in sd->llc_counts.
+		 *
+		 * This undercount is temporary and accurate accounting
+		 * will resume once the rq has a chance to be idle.
+		 */
+		if (sd->llc_counts[pref_llc])
+			sd->llc_counts[pref_llc]--;
+	}
 }
 
 void mm_init_sched(struct mm_struct *mm,
-- 
2.32.0