From: Qiliang Yuan
To: christian.loehle@arm.com
Cc: bsegall@google.com, dietmar.eggemann@arm.com, juri.lelli@redhat.com,
 linux-kernel@vger.kernel.org, mgorman@suse.de, mingo@redhat.com,
 peterz@infradead.org, realwujing@gmail.com, rostedt@goodmis.org,
 vincent.guittot@linaro.org, vschneid@redhat.com, yuanql9@chinatelecom.cn
Subject: [PATCH v2] sched/fair: Optimize EAS energy calculation complexity
 from O(N) to O(1) inside inner loop
Date: Fri, 23 Jan 2026 04:14:01 -0500
Message-ID: <20260123091402.675730-1-realwujing@gmail.com>
X-Mailer: git-send-email 2.51.0
X-Mailing-List: linux-kernel@vger.kernel.org

Pre-calculate the base maximum utilization of each performance domain
during the main loop of find_energy_efficient_cpu() and cache it in the
local 'energy_env' structure.

With this base value cached, the maximum utilization for each candidate
CPU placement (such as prev_cpu and max_spare_cap_cpu) can be determined
in O(1) time, eliminating redundant scans of the performance domain. This
reduces the number of scans per performance domain in the energy
estimation path from three to one.

This is expected to reduce wake-up latency on systems with high core
counts or complex performance domain topologies by lowering the cost of
the Energy-Aware Scheduling (EAS) calculation.

Signed-off-by: Qiliang Yuan
Signed-off-by: Qiliang Yuan
---
v2:
 - Ensure RCU safety by using local 'energy_env' for caching instead of
   modifying the shared 'perf_domain' structure.
 - Consolidate pre-calculation into the main loop to avoid an extra pass
   over the performance domains.
v1:
 - Optimize energy calculation by pre-calculating performance domain max
   utilization.
 - Add max_util and max_spare_cap_cpu to struct perf_domain.
 - Reduce inner loop complexity from O(N) to O(1) for energy estimation.

 kernel/sched/fair.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e71302282671..5c114c49c202 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8148,6 +8148,7 @@ struct energy_env {
 	unsigned long pd_busy_time;
 	unsigned long cpu_cap;
 	unsigned long pd_cap;
+	unsigned long pd_max_util;
 };
 
 /*
@@ -8215,41 +8216,32 @@ static inline void eenv_pd_busy_time(struct energy_env *eenv,
  * exceed @eenv->cpu_cap.
  */
 static inline unsigned long
-eenv_pd_max_util(struct energy_env *eenv, struct cpumask *pd_cpus,
+eenv_pd_max_util(struct energy_env *eenv, struct perf_domain *pd,
 		 struct task_struct *p, int dst_cpu)
 {
-	unsigned long max_util = 0;
-	int cpu;
+	unsigned long max_util = eenv->pd_max_util;
 
-	for_each_cpu(cpu, pd_cpus) {
-		struct task_struct *tsk = (cpu == dst_cpu) ? p : NULL;
-		unsigned long util = cpu_util(cpu, p, dst_cpu, 1);
+	if (dst_cpu >= 0 && cpumask_test_cpu(dst_cpu, perf_domain_span(pd))) {
+		unsigned long util = cpu_util(dst_cpu, p, dst_cpu, 1);
 		unsigned long eff_util, min, max;
 
-		/*
-		 * Performance domain frequency: utilization clamping
-		 * must be considered since it affects the selection
-		 * of the performance domain frequency.
-		 * NOTE: in case RT tasks are running, by default the min
-		 * utilization can be max OPP.
-		 */
-		eff_util = effective_cpu_util(cpu, util, &min, &max);
+		eff_util = effective_cpu_util(dst_cpu, util, &min, &max);
 
 		/* Task's uclamp can modify min and max value */
-		if (tsk && uclamp_is_used()) {
+		if (uclamp_is_used()) {
 			min = max(min, uclamp_eff_value(p, UCLAMP_MIN));
 
 			/*
 			 * If there is no active max uclamp constraint,
 			 * directly use task's one, otherwise keep max.
 			 */
-			if (uclamp_rq_is_idle(cpu_rq(cpu)))
+			if (uclamp_rq_is_idle(cpu_rq(dst_cpu)))
 				max = uclamp_eff_value(p, UCLAMP_MAX);
 			else
 				max = max(max, uclamp_eff_value(p, UCLAMP_MAX));
 		}
 
-		eff_util = sugov_effective_cpu_perf(cpu, eff_util, min, max);
+		eff_util = sugov_effective_cpu_perf(dst_cpu, eff_util, min, max);
 		max_util = max(max_util, eff_util);
 	}
 
@@ -8265,7 +8257,7 @@ static inline unsigned long
 compute_energy(struct energy_env *eenv, struct perf_domain *pd,
 	       struct cpumask *pd_cpus, struct task_struct *p, int dst_cpu)
 {
-	unsigned long max_util = eenv_pd_max_util(eenv, pd_cpus, p, dst_cpu);
+	unsigned long max_util = eenv_pd_max_util(eenv, pd, p, dst_cpu);
 	unsigned long busy_time = eenv->pd_busy_time;
 	unsigned long energy;
 
@@ -8376,12 +8368,20 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 
 		eenv.cpu_cap = cpu_actual_cap;
 		eenv.pd_cap = 0;
+		eenv.pd_max_util = 0;
 
 		for_each_cpu(cpu, cpus) {
 			struct rq *rq = cpu_rq(cpu);
+			unsigned long util_b, eff_util_b, min_b, max_b;
 
 			eenv.pd_cap += cpu_actual_cap;
 
+			/* Pre-calculate base max utilization for the performance domain */
+			util_b = cpu_util(cpu, p, -1, 1);
+			eff_util_b = effective_cpu_util(cpu, util_b, &min_b, &max_b);
+			eff_util_b = sugov_effective_cpu_perf(cpu, eff_util_b, min_b, max_b);
+			eenv.pd_max_util = max(eenv.pd_max_util, eff_util_b);
+
 			if (!cpumask_test_cpu(cpu, sched_domain_span(sd)))
 				continue;
 
-- 
2.51.0
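
For readers following the idea outside the kernel tree, the caching scheme can be sketched as a small userspace model. This is only an illustration under simplifying assumptions, not the kernel implementation. The names toy_pd, toy_max_util_scan(), toy_base_max_util() and toy_max_util_cached() are hypothetical. Utilization is a plain per-CPU array that already excludes the waking task, every CPU in the domain has the same capacity, and uclamp and the schedutil frequency mapping are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one performance domain; not a kernel structure. */
struct toy_pd {
	const unsigned long *util;	/* per-CPU utilization without the waking task */
	size_t nr_cpus;
	unsigned long cpu_cap;		/* capacity shared by every CPU in the domain */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Reference behaviour: rescan every CPU for each candidate placement, O(N).
 * A negative dst_cpu means the task is placed nowhere in this domain.
 */
unsigned long toy_max_util_scan(const struct toy_pd *pd,
				unsigned long task_util, int dst_cpu)
{
	unsigned long max_util = 0;
	size_t cpu;

	for (cpu = 0; cpu < pd->nr_cpus; cpu++) {
		unsigned long util = pd->util[cpu];

		if (dst_cpu >= 0 && cpu == (size_t)dst_cpu)
			util += task_util;
		max_util = max_ul(max_util, min_ul(util, pd->cpu_cap));
	}
	return max_util;
}

/* Single pass, done once per domain: the maximum with the task placed nowhere. */
unsigned long toy_base_max_util(const struct toy_pd *pd)
{
	return toy_max_util_scan(pd, 0, -1);
}

/* O(1) per candidate: only dst_cpu can rise above the cached base value. */
unsigned long toy_max_util_cached(const struct toy_pd *pd, unsigned long base,
				  unsigned long task_util, int dst_cpu)
{
	if (dst_cpu < 0)
		return base;
	assert((size_t)dst_cpu < pd->nr_cpus);
	return max_ul(base, min_ul(pd->util[dst_cpu] + task_util, pd->cpu_cap));
}
```

In this model the cached variant matches the full scan because placing the task changes the value of dst_cpu only. Whether the same holds in the kernel depends on every other per-CPU term (uclamp in particular) being independent of the candidate placement, which this sketch simply assumes.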