From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, Gautham R.
Shenoy" , Vincent Guittot Cc: Tim Chen , Juri Lelli , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Madadi Vineeth Reddy , Hillf Danton , Shrikanth Hegde , Jianyong Wu , Yangyu Chen , Tingyin Duan , Vern Hao , Vern Hao , Len Brown , Aubrey Li , Zhao Liu , Chen Yu , Chen Yu , Adam Li , Aaron Lu , Tim Chen , linux-kernel@vger.kernel.org Subject: [PATCH v2 06/23] sched/cache: Track LLC-preferred tasks per runqueue Date: Wed, 3 Dec 2025 15:07:25 -0800 Message-Id: X-Mailer: git-send-email 2.32.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" For each runqueue, track the number of tasks with an LLC preference and how many of them are running on their preferred LLC. This mirrors nr_numa_running and nr_preferred_running for NUMA balancing, and will be used by cache-aware load balancing in later patches. Signed-off-by: Tim Chen --- Notes: v1->v2: Invoke task_of() once and reuse its result afterwards. (Peter Zijlstra) Remove hacky reset_llc_stats() and introduce sched_llc_active f= lag to properly pair enqueue/dequeue statistics update (Peter Zijls= tra, K Prateek Nayak) include/linux/sched.h | 2 ++ init/init_task.c | 1 + kernel/sched/core.c | 5 ++++ kernel/sched/fair.c | 60 ++++++++++++++++++++++++++++++++++++++++--- kernel/sched/sched.h | 6 +++++ 5 files changed, 71 insertions(+), 3 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 1ad46220cd04..466ba8b7398c 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1408,6 +1408,8 @@ struct task_struct { =20 #ifdef CONFIG_SCHED_CACHE struct callback_head cache_work; + /*the p is currently refcounted in a rq's preferred llc stats*/ + bool sched_llc_active; int preferred_llc; #endif =20 diff --git a/init/init_task.c b/init/init_task.c index 44bae72b5b7d..ee78837b0aa2 100644 --- a/init/init_task.c +++ b/init/init_task.c @@ -192,6 +192,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = =3D { .numa_faults =3D NULL, #endif #ifdef CONFIG_SCHED_CACHE + .sched_llc_active =3D false, .preferred_llc =3D -1, #endif #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index e8bdf03a4b7f..48626c81ba8e 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -531,6 +531,11 @@ void __trace_set_current_state(int state_value) } EXPORT_SYMBOL(__trace_set_current_state); =20 +int task_llc(const struct task_struct *p) +{ + return per_cpu(sd_llc_id, task_cpu(p)); +} + /* * Serialization rules: * diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 10cec83f65d5..d46a70a9d9fb 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1223,6 +1223,43 @@ static int llc_id(int cpu) return llc; } =20 +static void account_llc_enqueue(struct rq *rq, struct task_struct *p) +{ + int pref_llc; + + if (!sched_cache_enabled()) + return; + + pref_llc =3D p->preferred_llc; + if (pref_llc < 0) + return; + + rq->nr_llc_running++; + rq->nr_pref_llc_running +=3D (pref_llc =3D=3D task_llc(p)); + p->sched_llc_active =3D true; +} + +static void account_llc_dequeue(struct rq *rq, struct task_struct *p) +{ + int pref_llc; + + /* + * Borrow the uc_se->active from uclamp_rq_inc_id(), + * uclamp_rq_dec_id() to avoid the unbalanced calculation + * of rq statistics. 
 include/linux/sched.h |  2 ++
 init/init_task.c      |  1 +
 kernel/sched/core.c   |  5 ++++
 kernel/sched/fair.c   | 60 ++++++++++++++++++++++++++++++++++++++++---
 kernel/sched/sched.h  |  6 +++++
 5 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1ad46220cd04..466ba8b7398c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1408,6 +1408,8 @@ struct task_struct {
 
 #ifdef CONFIG_SCHED_CACHE
 	struct callback_head		cache_work;
+	/* p is currently refcounted in a rq's preferred-LLC stats */
+	bool				sched_llc_active;
 	int				preferred_llc;
 #endif
 
diff --git a/init/init_task.c b/init/init_task.c
index 44bae72b5b7d..ee78837b0aa2 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -192,6 +192,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
 	.numa_faults	= NULL,
 #endif
 #ifdef CONFIG_SCHED_CACHE
+	.sched_llc_active = false,
 	.preferred_llc	= -1,
 #endif
 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e8bdf03a4b7f..48626c81ba8e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -531,6 +531,11 @@ void __trace_set_current_state(int state_value)
 }
 EXPORT_SYMBOL(__trace_set_current_state);
 
+int task_llc(const struct task_struct *p)
+{
+	return per_cpu(sd_llc_id, task_cpu(p));
+}
+
 /*
  * Serialization rules:
  *
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 10cec83f65d5..d46a70a9d9fb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1223,6 +1223,43 @@ static int llc_id(int cpu)
 	return llc;
 }
 
+static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
+{
+	int pref_llc;
+
+	if (!sched_cache_enabled())
+		return;
+
+	pref_llc = p->preferred_llc;
+	if (pref_llc < 0)
+		return;
+
+	rq->nr_llc_running++;
+	rq->nr_pref_llc_running += (pref_llc == task_llc(p));
+	p->sched_llc_active = true;
+}
+
+static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
+{
+	int pref_llc;
+
+	/*
+	 * Borrow the uc_se->active scheme from uclamp_rq_inc_id() and
+	 * uclamp_rq_dec_id() to avoid unbalanced updates of the rq
+	 * statistics.
+	 */
+	if (unlikely(!p->sched_llc_active))
+		return;
+
+	pref_llc = p->preferred_llc;
+	if (pref_llc < 0)
+		return;
+
+	rq->nr_llc_running--;
+	rq->nr_pref_llc_running -= (pref_llc == task_llc(p));
+	p->sched_llc_active = false;
+}
+
 void mm_init_sched(struct mm_struct *mm, struct mm_sched __percpu *_pcpu_sched)
 {
 	unsigned long epoch;
@@ -1294,6 +1331,8 @@ static unsigned long __no_profile fraction_mm_sched(struct rq *rq, struct mm_sch
 	return div64_u64(NICE_0_LOAD * pcpu_sched->runtime, rq->cpu_runtime + 1);
 }
 
+static unsigned int task_running_on_cpu(int cpu, struct task_struct *p);
+
 static inline
 void account_mm_sched(struct rq *rq, struct task_struct *p, s64 delta_exec)
 {
@@ -1346,8 +1385,13 @@ void account_mm_sched(struct rq *rq, struct task_struct *p, s64 delta_exec)
 #endif
 	}
 
-	if (p->preferred_llc != mm_sched_llc)
+	/* a task not on the rq is accounted later, in account_entity_enqueue() */
+	if (task_running_on_cpu(rq->cpu, p) &&
+	    p->preferred_llc != mm_sched_llc) {
+		account_llc_dequeue(rq, p);
 		p->preferred_llc = mm_sched_llc;
+		account_llc_enqueue(rq, p);
+	}
 }
 
 static void task_tick_cache(struct rq *rq, struct task_struct *p)
@@ -1475,6 +1519,10 @@ void init_sched_mm(struct task_struct *p) { }
 
 static void task_tick_cache(struct rq *rq, struct task_struct *p) { }
 
+static void account_llc_enqueue(struct rq *rq, struct task_struct *p) {}
+
+static void account_llc_dequeue(struct rq *rq, struct task_struct *p) {}
+
 #endif
 
 /*
@@ -3965,9 +4013,11 @@ account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	update_load_add(&cfs_rq->load, se->load.weight);
 	if (entity_is_task(se)) {
+		struct task_struct *p = task_of(se);
 		struct rq *rq = rq_of(cfs_rq);
 
-		account_numa_enqueue(rq, task_of(se));
+		account_numa_enqueue(rq, p);
+		account_llc_enqueue(rq, p);
 		list_add(&se->group_node, &rq->cfs_tasks);
 	}
 	cfs_rq->nr_queued++;
@@ -3978,7 +4028,11 @@ account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	update_load_sub(&cfs_rq->load, se->load.weight);
 	if (entity_is_task(se)) {
-		account_numa_dequeue(rq_of(cfs_rq), task_of(se));
+		struct task_struct *p = task_of(se);
+		struct rq *rq = rq_of(cfs_rq);
+
+		account_numa_dequeue(rq, p);
+		account_llc_dequeue(rq, p);
 		list_del_init(&se->group_node);
 	}
 	cfs_rq->nr_queued--;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 728737641847..ee8b70647835 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1126,6 +1126,10 @@ struct rq {
 	unsigned int		nr_preferred_running;
 	unsigned int		numa_migrate_on;
 #endif
+#ifdef CONFIG_SCHED_CACHE
+	unsigned int		nr_pref_llc_running;
+	unsigned int		nr_llc_running;
+#endif
 #ifdef CONFIG_NO_HZ_COMMON
 	unsigned long		last_blocked_load_update_tick;
 	unsigned int		has_blocked_load;
@@ -1980,6 +1984,8 @@ init_numa_balancing(u64 clone_flags, struct task_struct *p)
 
 #endif /* !CONFIG_NUMA_BALANCING */
 
+int task_llc(const struct task_struct *p);
+
 static inline
 void queue_balance_callback(struct rq *rq,
 			    struct balance_callback *head,
-- 
2.32.0
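A forward-looking aside, not part of this patch: the series only introduces
the counters here, and later patches consume them for cache-aware load
balancing. One plausible shape of that consumption, as a self-contained
sketch; struct llc_stats and pct_on_pref_llc() are names invented for
illustration, not from the series:

    /*
     * Sketch only: how a balancing step might interpret the two counters,
     * modeled in plain userspace C rather than against struct rq.
     */
    #include <stdio.h>

    struct llc_stats { unsigned int nr_llc_running, nr_pref_llc_running; };

    /* percentage of LLC-preferring tasks already on their preferred LLC */
    static unsigned int pct_on_pref_llc(const struct llc_stats *s)
    {
            if (!s->nr_llc_running)
                    return 100;     /* no task has a preference: nothing to fix */
            return s->nr_pref_llc_running * 100 / s->nr_llc_running;
    }

    int main(void)
    {
            struct llc_stats s = { .nr_llc_running = 4,
                                   .nr_pref_llc_running = 3 };

            /* a balancer might prefer pulling from rqs with a low percentage */
            printf("%u%% of preferring tasks on their LLC\n",
                   pct_on_pref_llc(&s));
            return 0;
    }

Keeping both a total and an on-preferred count lets such a policy distinguish
"no queued task cares about LLC placement" from "tasks care but are badly
placed" without scanning the runqueue.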