From nobody Thu Apr 2 16:58:15 2026
From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, Gautham R. Shenoy,
 Vincent Guittot
Cc: Tim Chen, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
 Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
 Len Brown, Aubrey Li, Zhao Liu, Chen Yu, Adam Li, Aaron Lu, Josh Don,
 Gavin Guo, Qais Yousef, Libo Chen, linux-kernel@vger.kernel.org
Subject: [PATCH v3 08/21] sched/cache: Calculate the percpu sd task LLC preference
Date: Tue, 10 Feb 2026 14:18:48 -0800
Message-Id: <41f8e91b70060e7697840163b80c3dc097aabb34.1770760558.git.tim.c.chen@linux.intel.com>

Calculate the number of tasks' LLC preferences for each runqueue. This
statistic is computed during task enqueue and dequeue operations, and is
used by cache-aware load balancing.

Co-developed-by: Chen Yu
Signed-off-by: Chen Yu
Signed-off-by: Tim Chen
---
Notes:
    v2->v3: Move the max_llcs check from patch 4 to this patch. This
    clarifies the rationale for the max_llcs check and makes review
    easier (Peter Zijlstra).
 kernel/sched/fair.c | 56 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6ad9ad2f918f..4a98aa866d65 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1199,28 +1199,80 @@ static int llc_id(int cpu)
 	return per_cpu(sd_llc_id, cpu);
 }

+static inline bool valid_llc_id(int id)
+{
+	if (unlikely(id < 0 || id >= max_llcs))
+		return false;
+
+	return true;
+}
+
+static inline bool valid_llc_buf(struct sched_domain *sd,
+				 int id)
+{
+	/*
+	 * The check for sd and its corresponding pf is to
+	 * confirm that the sd->pf[] has been allocated in
+	 * build_sched_domains() after the assignment of
+	 * per_cpu(sd_llc_id, i). This is used to avoid
+	 * the race condition.
+	 */
+	if (unlikely(!sd || !sd->pf))
+		return false;
+
+	return valid_llc_id(id);
+}
+
 static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;

 	pref_llc = p->preferred_llc;
-	if (pref_llc < 0)
+	if (!valid_llc_id(pref_llc))
 		return;

 	rq->nr_llc_running++;
 	rq->nr_pref_llc_running += (pref_llc == task_llc(p));
+
+	scoped_guard (rcu) {
+		sd = rcu_dereference(rq->sd);
+		if (valid_llc_buf(sd, pref_llc))
+			sd->pf[pref_llc]++;
+	}
 }

 static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;

 	pref_llc = p->preferred_llc;
-	if (pref_llc < 0)
+	if (!valid_llc_id(pref_llc))
 		return;

 	rq->nr_llc_running--;
 	rq->nr_pref_llc_running -= (pref_llc == task_llc(p));
+
+	scoped_guard (rcu) {
+		sd = rcu_dereference(rq->sd);
+		if (valid_llc_buf(sd, pref_llc)) {
+			/*
+			 * There is a race condition between dequeue
+			 * and CPU hotplug. After a task has been enqueued
+			 * on CPUx, a CPU hotplug event occurs, and all online
+			 * CPUs (including CPUx) rebuild their sched_domains
+			 * and reset statistics to zero (including sd->pf).
+			 * This can cause a temporary undercount, so we have
+			 * to check for such underflow in sd->pf.
+			 *
+			 * The undercount is temporary, and accurate accounting
+			 * will resume once the rq has a chance to be idle.
+			 */
+			if (sd->pf[pref_llc])
+				sd->pf[pref_llc]--;
+		}
+	}
 }

 void mm_init_sched(struct mm_struct *mm,
-- 
2.32.0