From: Qiliang Yuan
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Qiliang Yuan, Qiliang Yuan, Dietmar Eggemann, Steven Rostedt,
    Ben Segall, Mel Gorman, Valentin Schneider,
    linux-kernel@vger.kernel.org
Subject: [PATCH v2 RSEND] sched/fair: Optimize EAS energy calculation complexity from O(N) to O(1) inside inner loop
Date: Sun, 1 Feb 2026 22:05:09 -0500
Message-ID: <20260202030512.2792311-1-realwujing@gmail.com>
X-Mailer: git-send-email 2.51.0

Pre-calculate the base maximum utilization of each performance domain
during the main loop of find_energy_efficient_cpu() and cache it in the
local 'energy_env' structure.

By caching this base value, the maximum utilization for candidate CPU
placements (such as prev_cpu and max_spare_cap_cpu) can be determined in
O(1) time, eliminating redundant scans of the performance domain. This
optimizes the energy estimation path by reducing the number of scans per
performance domain from three to one.

This change is expected to reduce wake-up latency on systems with high
core counts or complex performance domain topologies by lowering the
overall cost of the Energy-Aware Scheduling (EAS) calculation.

Signed-off-by: Qiliang Yuan
Signed-off-by: Qiliang Yuan
---
v2:
 - Ensure RCU safety by using local 'energy_env' for caching instead of
   modifying the shared 'perf_domain' structure.
 - Consolidate pre-calculation into the main loop to avoid an extra pass
   over the performance domains.

v1:
 - Optimize energy calculation by pre-calculating performance domain max
   utilization.
 - Add max_util and max_spare_cap_cpu to struct perf_domain.
 - Reduce inner loop complexity from O(N) to O(1) for energy estimation.

 kernel/sched/fair.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e71302282671..5c114c49c202 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8148,6 +8148,7 @@ struct energy_env {
 	unsigned long pd_busy_time;
 	unsigned long cpu_cap;
 	unsigned long pd_cap;
+	unsigned long pd_max_util;
 };
 
 /*
@@ -8215,41 +8216,32 @@ static inline void eenv_pd_busy_time(struct energy_env *eenv,
  * exceed @eenv->cpu_cap.
  */
 static inline unsigned long
-eenv_pd_max_util(struct energy_env *eenv, struct cpumask *pd_cpus,
+eenv_pd_max_util(struct energy_env *eenv, struct perf_domain *pd,
 		 struct task_struct *p, int dst_cpu)
 {
-	unsigned long max_util = 0;
-	int cpu;
+	unsigned long max_util = eenv->pd_max_util;
 
-	for_each_cpu(cpu, pd_cpus) {
-		struct task_struct *tsk = (cpu == dst_cpu) ? p : NULL;
-		unsigned long util = cpu_util(cpu, p, dst_cpu, 1);
+	if (dst_cpu >= 0 && cpumask_test_cpu(dst_cpu, perf_domain_span(pd))) {
+		unsigned long util = cpu_util(dst_cpu, p, dst_cpu, 1);
 		unsigned long eff_util, min, max;
 
-		/*
-		 * Performance domain frequency: utilization clamping
-		 * must be considered since it affects the selection
-		 * of the performance domain frequency.
-		 * NOTE: in case RT tasks are running, by default the min
-		 * utilization can be max OPP.
-		 */
-		eff_util = effective_cpu_util(cpu, util, &min, &max);
+		eff_util = effective_cpu_util(dst_cpu, util, &min, &max);
 
 		/* Task's uclamp can modify min and max value */
-		if (tsk && uclamp_is_used()) {
+		if (uclamp_is_used()) {
 			min = max(min, uclamp_eff_value(p, UCLAMP_MIN));
 
 			/*
 			 * If there is no active max uclamp constraint,
 			 * directly use task's one, otherwise keep max.
 			 */
-			if (uclamp_rq_is_idle(cpu_rq(cpu)))
+			if (uclamp_rq_is_idle(cpu_rq(dst_cpu)))
 				max = uclamp_eff_value(p, UCLAMP_MAX);
 			else
 				max = max(max, uclamp_eff_value(p, UCLAMP_MAX));
 		}
 
-		eff_util = sugov_effective_cpu_perf(cpu, eff_util, min, max);
+		eff_util = sugov_effective_cpu_perf(dst_cpu, eff_util, min, max);
 		max_util = max(max_util, eff_util);
 	}
 
@@ -8265,7 +8257,7 @@ static inline unsigned long
 compute_energy(struct energy_env *eenv, struct perf_domain *pd,
 	       struct cpumask *pd_cpus, struct task_struct *p, int dst_cpu)
 {
-	unsigned long max_util = eenv_pd_max_util(eenv, pd_cpus, p, dst_cpu);
+	unsigned long max_util = eenv_pd_max_util(eenv, pd, p, dst_cpu);
 	unsigned long busy_time = eenv->pd_busy_time;
 	unsigned long energy;
 
@@ -8376,12 +8368,20 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 
 		eenv.cpu_cap = cpu_actual_cap;
 		eenv.pd_cap = 0;
+		eenv.pd_max_util = 0;
 
 		for_each_cpu(cpu, cpus) {
 			struct rq *rq = cpu_rq(cpu);
+			unsigned long util_b, eff_util_b, min_b, max_b;
 
 			eenv.pd_cap += cpu_actual_cap;
 
+			/* Pre-calculate base max utilization for the performance domain */
+			util_b = cpu_util(cpu, p, -1, 1);
+			eff_util_b = effective_cpu_util(cpu, util_b, &min_b, &max_b);
+			eff_util_b = sugov_effective_cpu_perf(cpu, eff_util_b, min_b, max_b);
+			eenv.pd_max_util = max(eenv.pd_max_util, eff_util_b);
+
 			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
 				continue;
 
-- 
2.51.0
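
The caching idea from the commit message can be illustrated outside the
kernel. What follows is a minimal user-space sketch, not the kernel code:
the names NR_PD_CPUS, base_util, max_ul, pd_base_max_util,
pd_max_util_cached and pd_max_util_scan are hypothetical, the utilization
numbers are made up, and plain addition stands in for cpu_util(),
effective_cpu_util() and the uclamp handling. It also assumes that placing
the task on a CPU never lowers that CPU's utilization.

```c
#include <assert.h>

#define NR_PD_CPUS 4

/* Hypothetical per-CPU utilization of one performance domain, task excluded. */
static const unsigned long base_util[NR_PD_CPUS] = { 120, 300, 80, 210 };

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Single O(N) pass: domain maximum with the task placed nowhere. */
static unsigned long pd_base_max_util(void)
{
	unsigned long max_util = 0;

	for (int cpu = 0; cpu < NR_PD_CPUS; cpu++)
		max_util = max_ul(max_util, base_util[cpu]);
	return max_util;
}

/*
 * O(1) per candidate: only dst_cpu differs from the base case.
 * dst_cpu < 0 means "no placement in this domain".
 * Assumption: adding the task never lowers dst_cpu's utilization.
 */
static unsigned long pd_max_util_cached(unsigned long base_max, int dst_cpu,
					unsigned long task_util)
{
	if (dst_cpu < 0)
		return base_max;
	assert(dst_cpu < NR_PD_CPUS);
	return max_ul(base_max, base_util[dst_cpu] + task_util);
}

/* Reference O(N) rescan, used only to cross-check the cached variant. */
static unsigned long pd_max_util_scan(int dst_cpu, unsigned long task_util)
{
	unsigned long max_util = 0;

	for (int cpu = 0; cpu < NR_PD_CPUS; cpu++) {
		unsigned long util = base_util[cpu];

		if (cpu == dst_cpu)
			util += task_util;
		max_util = max_ul(max_util, util);
	}
	return max_util;
}
```

Calling pd_base_max_util() once and then pd_max_util_cached() for each
candidate gives the same result as pd_max_util_scan() in this model. The
stated assumption matters: if a per-task clamp could pull dst_cpu's
effective value below its base value, the cached maximum and a full rescan
could disagree.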