From nobody Sun Apr 12 21:02:22 2026
From: Tim Chen
To: Peter Zijlstra, Ingo Molnar, K Prateek Nayak, "Gautham R. Shenoy",
	Vincent Guittot
Cc: Tim Chen, Juri Lelli, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Valentin Schneider, Madadi Vineeth Reddy, Hillf Danton,
	Shrikanth Hegde, Jianyong Wu, Yangyu Chen, Tingyin Duan, Vern Hao,
	Len Brown, Aubrey Li, Zhao Liu, Chen Yu, Adam Li, Aaron Lu,
	Josh Don, Gavin Guo, Qais Yousef, Libo Chen,
	linux-kernel@vger.kernel.org
Subject: [Patch v4 09/22] sched/cache: Calculate the percpu sd task LLC preference
Date: Wed, 1 Apr 2026 14:52:21 -0700
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Calculate the number of tasks' LLC preferences for each runqueue.
This statistic is computed during task enqueue and dequeue operations,
and is used by cache-aware load balancing.

Co-developed-by: Chen Yu
Signed-off-by: Chen Yu
Signed-off-by: Tim Chen
---
Notes:
    v3->v4: Remove unnecessary rcu_read_lock() in enqueue/dequeue as the
            rq lock is held. Use rcu_dereference_all() directly.
            (Peter Zijlstra)

 kernel/sched/fair.c | 49 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4b760bd604e7..e6474e61f4aa 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1291,8 +1291,34 @@ static int llc_id(int cpu)
 	return per_cpu(sd_llc_id, cpu);
 }
 
+static inline bool valid_llc_buf(struct sched_domain *sd,
+				 int id)
+{
+	/*
+	 * These checks are used to avoid the following
+	 * race causing out-of-range access to llc_counts:
+	 *
+	 *   CPU0                             CPU1
+	 *                                      :
+	 *                                     ...
+	 *   build_sched_domains              update_sg_lb_stats
+	 *                                      for_each_cpu_and(i, sg)
+	 *                                        sd = rq[i]->sd
+	 *   per_cpu(sd_llc_id, i) = new_llc
+	 *                                        llc = llc_id(i)
+	 *                                        !!!sd->llc_counts[llc]!!!
+	 *   sd->llc_counts = kzalloc()
+	 *   sd->llc_max = max_llc
+	 */
+	if (unlikely(id < 0 || !sd || !sd->llc_counts || id > sd->llc_max))
+		return false;
+
+	return true;
+}
+
 static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;
 
 	pref_llc = p->preferred_llc;
@@ -1301,10 +1327,15 @@ static void account_llc_enqueue(struct rq *rq, struct task_struct *p)
 
 	rq->nr_llc_running++;
 	rq->nr_pref_llc_running += (pref_llc == task_llc(p));
+
+	sd = rcu_dereference_all(rq->sd);
+	if (valid_llc_buf(sd, pref_llc))
+		sd->llc_counts[pref_llc]++;
 }
 
 static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
 {
+	struct sched_domain *sd;
 	int pref_llc;
 
 	pref_llc = p->preferred_llc;
@@ -1313,6 +1344,24 @@ static void account_llc_dequeue(struct rq *rq, struct task_struct *p)
 
 	rq->nr_llc_running--;
 	rq->nr_pref_llc_running -= (pref_llc == task_llc(p));
+
+	sd = rcu_dereference_all(rq->sd);
+	if (valid_llc_buf(sd, pref_llc)) {
+		/*
+		 * There is a race condition between dequeue
+		 * and CPU hotplug. After a task has been enqueued
+		 * on CPUx, a CPU hotplug event occurs, and all online
+		 * CPUs (including CPUx) rebuild their sched_domains
+		 * and reset statistics to zero (including sd->llc_counts).
+		 * This can cause a temporary undercount, so we have to
+		 * check for such underflow in sd->llc_counts.
+		 *
+		 * This undercount is temporary and accurate accounting
+		 * will resume once the rq has a chance to be idle.
+		 */
+		if (sd->llc_counts[pref_llc])
+			sd->llc_counts[pref_llc]--;
+	}
 }
 
 void mm_init_sched(struct mm_struct *mm,
-- 
2.32.0