From nobody Thu Apr 2 16:58:15 2026
From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, Gautham R. Shenoy,
 Vincent Guittot
Cc: Tim Chen, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
 Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
 Len Brown, Aubrey Li, Zhao Liu, Chen Yu, Adam Li, Aaron Lu, Josh Don,
 Gavin Guo, Qais Yousef, Libo Chen, linux-kernel@vger.kernel.org
Subject: [PATCH v3 08/21] sched/cache: Calculate the percpu sd task LLC preference
Date: Tue, 10 Feb 2026 14:18:48 -0800
Message-Id: <41f8e91b70060e7697840163b80c3dc097aabb34.1770760558.git.tim.c.chen@linux.intel.com>

Calculate the number of tasks' LLC preferences for each runqueue. This
statistic is computed during task enqueue and dequeue operations, and is
used by cache-aware load balancing.

Co-developed-by: Chen Yu
Signed-off-by: Chen Yu
Signed-off-by: Tim Chen
---
Notes:
    v2->v3: Move the max_llcs check from patch 4 to this patch. This
    clarifies the rationale for the max_llcs check and makes review
    easier (Peter Zijlstra).
 kernel/sched/fair.c | 56 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6ad9ad2f918f..4a98aa866d65 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1199,28 +1199,80 @@ static int llc_id(int cpu)
 	return per_cpu(sd_llc_id, cpu);
 }

+static inline bool valid_llc_id(int id)
+{
+	if (unlikely(id < 0 || id >= max_llcs))
+		return false;
+
+	return true;
+}
+
+static inline bool valid_llc_buf(struct sched_domain *sd,
+				 int id)
+{
+	/*
+	 * The check for sd and its corresponding pf is to
+	 * confirm that the sd->pf[] has been allocated in
+	 * build_sched_domains() after the assignment of
+	 * per_cpu(sd_llc_id, i). This is used to avoid
+	 * the race condition.
+	 */
+	if (unlikely(!sd || !sd->pf))
+		return false;
+
+	return valid_llc_id(id);
+}
+
 static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;

 	pref_llc = p->preferred_llc;
-	if (pref_llc < 0)
+	if (!valid_llc_id(pref_llc))
 		return;

 	rq->nr_llc_running++;
 	rq->nr_pref_llc_running += (pref_llc == task_llc(p));
+
+	scoped_guard (rcu) {
+		sd = rcu_dereference(rq->sd);
+		if (valid_llc_buf(sd, pref_llc))
+			sd->pf[pref_llc]++;
+	}
 }

 static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;

 	pref_llc = p->preferred_llc;
-	if (pref_llc < 0)
+	if (!valid_llc_id(pref_llc))
 		return;

 	rq->nr_llc_running--;
 	rq->nr_pref_llc_running -= (pref_llc == task_llc(p));
+
+	scoped_guard (rcu) {
+		sd = rcu_dereference(rq->sd);
+		if (valid_llc_buf(sd, pref_llc)) {
+			/*
+			 * There is a race condition between dequeue
+			 * and CPU hotplug. After a task has been enqueued
+			 * on CPUx, a CPU hotplug event occurs, and all online
+			 * CPUs (including CPUx) rebuild their sched_domains
+			 * and reset statistics to zero (including sd->pf).
+			 * This can cause a temporary undercount, so we have
+			 * to check for such underflow in sd->pf.
+			 *
+			 * The undercount is temporary, and accurate accounting
+			 * will resume once the rq has a chance to be idle.
+			 */
+			if (sd->pf[pref_llc])
+				sd->pf[pref_llc]--;
+		}
+	}
 }

 void mm_init_sched(struct mm_struct *mm,
-- 
2.32.0