From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, "Rafael J. Wysocki", Viresh Kumar, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba, Wei Wang, Rick Yiu, Chung-Kai Mei, Qais Yousef
Subject: [PATCH v2 1/8] cpufreq: Change default transition delay to 2ms
Date: Fri, 8 Dec 2023 00:23:35 +0000
Message-Id: <20231208002342.367117-2-qyousef@layalina.io>
In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io>

10ms is too high for today's hardware, even low-end machines. This
default ends up being used a lot, on Arm machines at least.
Pine64, mac mini and pixel 6 all end up with a 10ms rate_limit_us when
using schedutil, and it's too high for all of them.

Change the default to 2ms, which should be 'pessimistic' enough for the
worst-case scenario, but not too high for platforms with fast DVFS
hardware.

Signed-off-by: Qais Yousef (Google)
---
 drivers/cpufreq/cpufreq.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 934d35f570b7..9875284ca6e4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -582,11 +582,11 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
 		 * for platforms where transition_latency is in milliseconds, it
 		 * ends up giving unrealistic values.
 		 *
-		 * Cap the default transition delay to 10 ms, which seems to be
+		 * Cap the default transition delay to 2 ms, which seems to be
 		 * a reasonable amount of time after which we should reevaluate
 		 * the frequency.
 		 */
-		return min(latency * LATENCY_MULTIPLIER, (unsigned int)10000);
+		return min(latency * LATENCY_MULTIPLIER, (unsigned int)(2*MSEC_PER_SEC));
 	}
 
 	return LATENCY_MULTIPLIER;
-- 
2.34.1
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, "Rafael J. Wysocki", Viresh Kumar, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba, Wei Wang, Rick Yiu, Chung-Kai Mei, Qais Yousef
Subject: [PATCH v2 2/8] sched: cpufreq: Rename map_util_perf to apply_dvfs_headroom
Date: Fri, 8 Dec 2023 00:23:36 +0000
Message-Id: <20231208002342.367117-3-qyousef@layalina.io>
In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io>

We are providing headroom for the utilization to grow until the next
decision point, at which the next frequency is picked. The function is
not really mapping anything, so give it a better name and some
documentation.

Also move it to sched.h: the function relies on the util signal being
updated appropriately to provide room to grow, which is scheduler
functionality rather than cpufreq functionality. sched.h is where all
the other util handling code belongs.
Signed-off-by: Qais Yousef (Google)
---
 include/linux/sched/cpufreq.h    |  5 -----
 kernel/sched/cpufreq_schedutil.c |  2 +-
 kernel/sched/sched.h             | 17 +++++++++++++++++
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched/cpufreq.h b/include/linux/sched/cpufreq.h
index bdd31ab93bc5..d01755d3142f 100644
--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h
@@ -28,11 +28,6 @@ static inline unsigned long map_util_freq(unsigned long util,
 {
 	return freq * util / cap;
 }
-
-static inline unsigned long map_util_perf(unsigned long util)
-{
-	return util + (util >> 2);
-}
 #endif /* CONFIG_CPU_FREQ */
 
 #endif /* _LINUX_SCHED_CPUFREQ_H */
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 4ee8ad70be99..79c3b96dc02c 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -157,7 +157,7 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
 				 unsigned long max)
 {
 	/* Add dvfs headroom to actual utilization */
-	actual = map_util_perf(actual);
+	actual = apply_dvfs_headroom(actual);
 	/* Actually we don't need to target the max performance */
 	if (actual < max)
 		max = actual;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e58a54bda77d..0da3425200b1 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3002,6 +3002,23 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
 				 unsigned long min, unsigned long max);
 
+/*
+ * DVFS decisions are made at discrete points. If the CPU stays busy, the util
+ * will continue to grow, which means it could need to run at a higher frequency
+ * before the next decision point is reached. IOW, we can't follow the util
+ * immediately as it grows; there's a delay before we issue a request to go to a
+ * higher frequency. The headroom caters for this delay so the system continues
+ * to run at an adequate performance point.
+ *
+ * This function provides enough headroom for adequate performance,
+ * assuming the CPU continues to be busy.
+ *
+ * At the moment it is a constant multiplication by 1.25.
+ */
+static inline unsigned long apply_dvfs_headroom(unsigned long util)
+{
+	return util + (util >> 2);
+}
 
 /*
  * Verify the fitness of task @p to run on @cpu taking into account the
-- 
2.34.1
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, "Rafael J. Wysocki", Viresh Kumar, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba, Wei Wang, Rick Yiu, Chung-Kai Mei, Qais Yousef
Subject: [PATCH v2 3/8] sched/pelt: Add a new function to approximate the future util_avg value
Date: Fri, 8 Dec 2023 00:23:37 +0000
Message-Id: <20231208002342.367117-4-qyousef@layalina.io>
In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io>

Given a util_avg value, the new function will return the future one,
given a runtime delta. This will be useful in later patches to help
replace some magic margins with more deterministic behavior.

Signed-off-by: Qais Yousef (Google)
---
 kernel/sched/pelt.c  | 22 +++++++++++++++++++++-
 kernel/sched/sched.h |  2 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
index 63b6cf898220..81555a8288be 100644
--- a/kernel/sched/pelt.c
+++ b/kernel/sched/pelt.c
@@ -466,4 +466,24 @@ int update_irq_load_avg(struct rq *rq, u64 running)
 
 	return ret;
 }
-#endif
+#endif /* CONFIG_HAVE_SCHED_AVG_IRQ */
+
+/*
+ * Approximate the new util_avg value assuming an entity has continued to run
+ * for @delta us.
+ */
+unsigned long approximate_util_avg(unsigned long util, u64 delta)
+{
+	struct sched_avg sa = {
+		.util_sum = util * PELT_MIN_DIVIDER,
+		.util_avg = util,
+	};
+
+	if (unlikely(!delta))
+		return util;
+
+	accumulate_sum(delta, &sa, 1, 0, 1);
+	___update_load_avg(&sa, 0);
+
+	return sa.util_avg;
+}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 0da3425200b1..7e5a86a376f8 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3002,6 +3002,8 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
 				 unsigned long min, unsigned long max);
 
+unsigned long approximate_util_avg(unsigned long util, u64 delta);
+
 /*
  * DVFS decisions are made at discrete points. If the CPU stays busy, the util
  * will continue to grow, which means it could need to run at a higher frequency
-- 
2.34.1
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, "Rafael J. Wysocki", Viresh Kumar, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba, Wei Wang, Rick Yiu, Chung-Kai Mei, Qais Yousef
Subject: [PATCH v2 4/8] sched/pelt: Add a new function to approximate runtime to reach given util
Date: Fri, 8 Dec 2023 00:23:38 +0000
Message-Id: <20231208002342.367117-5-qyousef@layalina.io>
In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io>

It is basically the ramp-up time from 0 to a given value. It will be
used later to implement a new tunable to control the response time of
schedutil.

Signed-off-by: Qais Yousef (Google)
---
 kernel/sched/pelt.c  | 21 +++++++++++++++++++++
 kernel/sched/sched.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
index 81555a8288be..00a1b9c1bf16 100644
--- a/kernel/sched/pelt.c
+++ b/kernel/sched/pelt.c
@@ -487,3 +487,24 @@ unsigned long approximate_util_avg(unsigned long util, u64 delta)
 
 	return sa.util_avg;
 }
+
+/*
+ * Approximate the amount of runtime, in ms, required to reach @util.
+ */
+u64 approximate_runtime(unsigned long util)
+{
+	struct sched_avg sa = {};
+	u64 delta = 1024; /* period = 1024 = ~1ms */
+	u64 runtime = 0;
+
+	if (unlikely(!util))
+		return runtime;
+
+	while (sa.util_avg < util) {
+		accumulate_sum(delta, &sa, 1, 0, 1);
+		___update_load_avg(&sa, 0);
+		runtime++;
+	}
+
+	return runtime;
+}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 7e5a86a376f8..2de64f59853c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3003,6 +3003,7 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual,
 				 unsigned long max);
 
 unsigned long approximate_util_avg(unsigned long util, u64 delta);
+u64 approximate_runtime(unsigned long util);
 
 /*
  * DVFS decisions are made at discrete points. If the CPU stays busy, the util
-- 
2.34.1
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, "Rafael J. Wysocki", Viresh Kumar, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba, Wei Wang, Rick Yiu, Chung-Kai Mei, Qais Yousef
Subject: [PATCH v2 5/8] sched/fair: Remove magic hardcoded margin in fits_capacity()
Date: Fri, 8 Dec 2023 00:23:39 +0000
Message-Id: <20231208002342.367117-6-qyousef@layalina.io>
In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io>

Replace the hardcoded margin value in fits_capacity() with better
dynamic logic.

The 80% margin is a magic value that has served its purpose for now,
but it no longer fits the variety of systems that exist today. If a
system is specifically overpowered, this 80% means we leave a lot of
capacity unused before we decide to upmigrate on an HMP system. On
other systems the little cores are underpowered, and the ability to
migrate away from them faster is desired.

The upmigration behavior should rely on the fact that a bad placement
decision will need load balance to kick in and perform a misfit
migration. And that is an adequate definition of how much headroom is
enough when deciding whether a util fits a capacity or not.

Use the new approximate_util_avg() function to predict the util if the
task continues to run for TICK_USEC. If the resulting value is not
strictly less than the capacity, then the task must not be placed
there, i.e. it is considered misfit.
Signed-off-by: Qais Yousef (Google)
---
 kernel/sched/fair.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bcea3d55d95d..b83448be3f79 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -101,16 +101,31 @@ int __weak arch_asym_cpu_priority(int cpu)
 }
 
 /*
- * The margin used when comparing utilization with CPU capacity.
+ * The util will fit the capacity if it has enough headroom to grow within the
+ * next tick - which is when any load balancing activity happens to do the
+ * correction.
  *
- * (default: ~20%)
+ * If util stays within the capacity before the tick has elapsed, then it
+ * should be fine. If not, then a correction action must happen shortly after
+ * it starts running, hence we treat it as !fit.
+ *
+ * TODO: TICK is not actually accurate enough. balance_interval is the correct
+ * one to use as the next load balance doesn't necessarily happen at tick.
+ * Accessing balance_interval might be tricky and will require some
+ * refactoring first.
  */
-#define fits_capacity(cap, max)	((cap) * 1280 < (max) * 1024)
+static inline bool fits_capacity(unsigned long util, unsigned long capacity)
+{
+	return approximate_util_avg(util, TICK_USEC) < capacity;
+}
 
 /*
  * The margin used when comparing CPU capacities.
  * is 'cap1' noticeably greater than 'cap2'
  *
+ * TODO: use approximate_util_avg() to give something more quantifiable based
+ * on time? Like 1ms?
+ *
  * (default: ~5%)
  */
 #define capacity_greater(cap1, cap2) ((cap1) * 1024 > (cap2) * 1078)
-- 
2.34.1
From: Qais Yousef
To: Ingo Molnar, Peter Zijlstra, "Rafael J. Wysocki", Viresh Kumar, Vincent Guittot, Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba, Wei Wang, Rick Yiu, Chung-Kai Mei, Qais Yousef
Subject: [PATCH v2 6/8] sched: cpufreq: Remove magic 1.25 headroom from apply_dvfs_headroom()
Date: Fri, 8 Dec 2023 00:23:40 +0000
Message-Id: <20231208002342.367117-7-qyousef@layalina.io>
In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io>

Replace the 1.25 headroom in apply_dvfs_headroom() with better dynamic
logic. Instead of the magical 1.25 value, use the new
approximate_util_avg() to provide headroom based on dvfs_update_delay,
the period at which the cpufreq governor sends DVFS updates to the
hardware.

Add a new percpu dvfs_update_delay that can be cheaply accessed
whenever apply_dvfs_headroom() is called. Cpufreq governors that rely
on util to drive their DVFS logic/algorithm are expected to populate
these percpu variables; schedutil is the only such governor at the
moment.

The behavior of schedutil will change: the headroom will be less than
1.25 for most systems, since rate_limit_us is usually short.
Signed-off-by: Qais Yousef (Google) --- kernel/sched/core.c | 1 + kernel/sched/cpufreq_schedutil.c | 13 +++++++++++-- kernel/sched/sched.h | 18 ++++++++++++++---- 3 files changed, 26 insertions(+), 6 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index db4be4921e7f..b4a1c8ea9e12 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -116,6 +116,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp); EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp); DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues); +DEFINE_PER_CPU_READ_MOSTLY(u64, dvfs_update_delay); #ifdef CONFIG_SCHED_DEBUG /* diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 79c3b96dc02c..1d4d6025c15f 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -157,7 +157,7 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual, unsigned long max) { /* Add dvfs headroom to actual utilization */ - actual = apply_dvfs_headroom(actual); + actual = apply_dvfs_headroom(actual, cpu); /* Actually we don't need to target the max performance */ if (actual < max) max = actual; @@ -535,15 +535,21 @@ rate_limit_us_store(struct gov_attr_set *attr_set, const char *buf, size_t count struct sugov_tunables *tunables = to_sugov_tunables(attr_set); struct sugov_policy *sg_policy; unsigned int rate_limit_us; + int cpu; if (kstrtouint(buf, 10, &rate_limit_us)) return -EINVAL; tunables->rate_limit_us = rate_limit_us; - list_for_each_entry(sg_policy, &attr_set->policy_list, tunables_hook) + list_for_each_entry(sg_policy, &attr_set->policy_list, tunables_hook) { + sg_policy->freq_update_delay_ns = rate_limit_us * NSEC_PER_USEC; + for_each_cpu(cpu, sg_policy->policy->cpus) + per_cpu(dvfs_update_delay, cpu) = rate_limit_us; + } + return count; } @@ -824,6 +830,9 @@ static int sugov_start(struct cpufreq_policy *policy) memset(sg_cpu, 0, sizeof(*sg_cpu)); sg_cpu->cpu
= cpu; sg_cpu->sg_policy = sg_policy; + + per_cpu(dvfs_update_delay, cpu) = sg_policy->tunables->rate_limit_us; + cpufreq_add_update_util_hook(cpu, &sg_cpu->update_util, uu); } return 0; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 2de64f59853c..bbece0eb053a 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3005,6 +3005,15 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual, unsigned long approximate_util_avg(unsigned long util, u64 delta); u64 approximate_runtime(unsigned long util); +/* + * Any governor that relies on the util signal to drive DVFS must populate these + * percpu dvfs_update_delay variables. + * + * It should describe the rate/delay (in us) at which the governor sends DVFS + * frequency updates to the hardware. + */ +DECLARE_PER_CPU_READ_MOSTLY(u64, dvfs_update_delay); + /* * DVFS decisions are made at discrete points. If the CPU stays busy, the util will * continue to grow, which means it could need to run at a higher frequency @@ -3014,13 +3023,14 @@ u64 approximate_runtime(unsigned long util); * to run at an adequate performance point. * * This function provides enough headroom to provide adequate performance - * assuming the CPU continues to be busy. + * assuming the CPU continues to be busy. This headroom is based on the + * dvfs_update_delay of the cpufreq governor. * - * At the moment it is a constant multiplication with 1.25. + * XXX: Should we provide headroom when the util is decaying?
*/ -static inline unsigned long apply_dvfs_headroom(unsigned long util) +static inline unsigned long apply_dvfs_headroom(unsigned long util, int cpu) { - return util + (util >> 2); + return approximate_util_avg(util, per_cpu(dvfs_update_delay, cpu)); } /* -- 2.34.1 From nobody Mon Feb 9 16:23:52 2026
From: Qais Yousef To: Ingo Molnar , Peter Zijlstra , "Rafael J.
Wysocki" , Viresh Kumar , Vincent Guittot , Dietmar Eggemann Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba , Wei Wang , Rick Yiu , Chung-Kai Mei , Qais Yousef Subject: [PATCH v2 7/8] sched/schedutil: Add a new tunable to dictate response time Date: Fri, 8 Dec 2023 00:23:41 +0000 Message-Id: <20231208002342.367117-8-qyousef@layalina.io> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io> References: <20231208002342.367117-1-qyousef@layalina.io> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The new tunable, response_time_ms, allow us to speed up or slow down the response time of the policy to meet the perf, power and thermal characteristic desired by the user/sysadmin. There's no single universal trade-off that we can apply for all systems even if they use the same SoC. The form factor of the system, the dominant use case, and in case of battery powered systems, the size of the battery and presence or absence of active cooling can play a big role on what would be best to use. The new tunable provides sensible defaults, but yet gives the power to control the response time to the user/sysadmin, if they wish to. This tunable is applied before we apply the DVFS headroom. The default behavior of applying 1.25 headroom can be re-instated easily now. But we continue to keep the min required headroom to overcome hardware limitation in its speed to change DVFS. And any additional headroom to speed things up must be applied by userspace to match their expectation for best perf/watt as it dictates a type of policy that will be better for some systems, but worse for others. There's a whitespace clean up included in sugov_start(). 
Signed-off-by: Qais Yousef (Google) --- Documentation/admin-guide/pm/cpufreq.rst | 17 +++- drivers/cpufreq/cpufreq.c | 4 +- include/linux/cpufreq.h | 3 + kernel/sched/cpufreq_schedutil.c | 115 ++++++++++++++++++++++- 4 files changed, 132 insertions(+), 7 deletions(-) diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst index 6adb7988e0eb..fa0d602a920e 100644 --- a/Documentation/admin-guide/pm/cpufreq.rst +++ b/Documentation/admin-guide/pm/cpufreq.rst @@ -417,7 +417,7 @@ is passed by the scheduler to the governor callback which causes the frequency to go up to the allowed maximum immediately and then draw back to the value returned by the above formula over time. -This governor exposes only one tunable: +This governor exposes two tunables: ``rate_limit_us`` Minimum time (in microseconds) that has to pass between two consecutive @@ -427,6 +427,21 @@ This governor exposes only one tunable: The purpose of this tunable is to reduce the scheduler context overhead of the governor which might be excessive without it. +``response_time_ms`` + Amount of time (in milliseconds) required to ramp the policy from + lowest to highest frequency. Can be decreased to speed up the + responsiveness of the system, or increased to slow the system down in + the hope of saving power. The best perf/watt will depend on the system + characteristics and the dominant workload you expect to run. For + userspace that has smart context on the type of workload running (like + in Android), one can tune this to suit the demands of that workload. + + Note that when slowing the response down, you can end up effectively + chopping off the top frequencies for that policy, as the util is capped + at 1024. On HMP systems this chopping effect will only occur on the + biggest core, whose capacity is 1024. Don't rely on this behavior, as + it is a limitation that can hopefully be improved in the future.
+ This governor generally is regarded as a replacement for the older `ondemand`_ and `conservative`_ governors (described below), as it is simpler and more tightly integrated with the CPU scheduler, its overhead in terms of CPU context diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 9875284ca6e4..15c397ce3252 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -533,8 +533,8 @@ void cpufreq_disable_fast_switch(struct cpufreq_policy *policy) } EXPORT_SYMBOL_GPL(cpufreq_disable_fast_switch); -static unsigned int __resolve_freq(struct cpufreq_policy *policy, - unsigned int target_freq, unsigned int relation) +unsigned int __resolve_freq(struct cpufreq_policy *policy, + unsigned int target_freq, unsigned int relation) { unsigned int idx; diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 1c5ca92a0555..29c3723653a3 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -613,6 +613,9 @@ int cpufreq_driver_target(struct cpufreq_policy *policy, int __cpufreq_driver_target(struct cpufreq_policy *policy, unsigned int target_freq, unsigned int relation); +unsigned int __resolve_freq(struct cpufreq_policy *policy, + unsigned int target_freq, + unsigned int relation); unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy, unsigned int target_freq); unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy); diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 1d4d6025c15f..788208becc13 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -8,9 +8,12 @@ #define IOWAIT_BOOST_MIN (SCHED_CAPACITY_SCALE / 8) +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, response_time_mult); + struct sugov_tunables { struct gov_attr_set attr_set; unsigned int rate_limit_us; + unsigned int response_time_ms; }; struct sugov_policy { @@ -22,6 +25,7 @@ struct sugov_policy { raw_spinlock_t
update_lock; u64 last_freq_update_time; s64 freq_update_delay_ns; + unsigned int freq_response_time_ms; unsigned int next_freq; unsigned int cached_raw_freq; @@ -59,6 +63,70 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu); /************************ Governor internals ***********************/ +static inline u64 sugov_calc_freq_response_ms(struct sugov_policy *sg_policy) +{ + int cpu = cpumask_first(sg_policy->policy->cpus); + unsigned long cap = arch_scale_cpu_capacity(cpu); + unsigned int max_freq, sec_max_freq; + + max_freq = sg_policy->policy->cpuinfo.max_freq; + sec_max_freq = __resolve_freq(sg_policy->policy, + max_freq - 1, + CPUFREQ_RELATION_H); + + /* + * We will request max_freq as soon as util crosses the capacity at the + * second highest frequency. So effectively our response time is the + * util at which we cross the cap@2nd_highest_freq. + */ + cap = sec_max_freq * cap / max_freq; + + return approximate_runtime(cap + 1); +} + +static inline void sugov_update_response_time_mult(struct sugov_policy *sg_policy) +{ + unsigned long mult; + int cpu; + + if (unlikely(!sg_policy->freq_response_time_ms)) + sg_policy->freq_response_time_ms = sugov_calc_freq_response_ms(sg_policy); + + mult = sg_policy->freq_response_time_ms * SCHED_CAPACITY_SCALE; + mult /= sg_policy->tunables->response_time_ms; + + if (SCHED_WARN_ON(!mult)) + mult = SCHED_CAPACITY_SCALE; + + for_each_cpu(cpu, sg_policy->policy->cpus) + per_cpu(response_time_mult, cpu) = mult; +} + +/* + * Shrink or expand how long it takes to reach the maximum performance of the + * policy. + * + * sg_policy->freq_response_time_ms is a constant value defined by the PELT + * HALFLIFE and the capacity of the policy (assuming HMP systems). + * + * sg_policy->tunables->response_time_ms is a user defined response time.
By + setting it lower than sg_policy->freq_response_time_ms, the system will + respond faster to changes in util, which will result in reaching the maximum + performance point quicker. By setting it higher, it'll slow down the amount + of time required to reach the maximum OPP. + * + * This should be applied when selecting the frequency. + */ +static inline unsigned long +sugov_apply_response_time(unsigned long util, int cpu) +{ + unsigned long mult; + + mult = per_cpu(response_time_mult, cpu) * util; + + return mult >> SCHED_CAPACITY_SHIFT; +} + static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time) { s64 delta_ns; @@ -156,7 +224,10 @@ unsigned long sugov_effective_cpu_perf(int cpu, unsigned long actual, unsigned long min, unsigned long max) { - /* Add dvfs headroom to actual utilization */ + /* + * Speed up/slow down the response time first, then apply the DVFS headroom. + */ + actual = sugov_apply_response_time(actual, cpu); actual = apply_dvfs_headroom(actual, cpu); /* Actually we don't need to target the max performance */ if (actual < max) @@ -555,8 +626,42 @@ rate_limit_us_store(struct gov_attr_set *attr_set, const char *buf, size_t count static struct governor_attr rate_limit_us = __ATTR_RW(rate_limit_us); +static ssize_t response_time_ms_show(struct gov_attr_set *attr_set, char *buf) +{ + struct sugov_tunables *tunables = to_sugov_tunables(attr_set); + + return sprintf(buf, "%u\n", tunables->response_time_ms); +} + +static ssize_t +response_time_ms_store(struct gov_attr_set *attr_set, const char *buf, size_t count) +{ + struct sugov_tunables *tunables = to_sugov_tunables(attr_set); + struct sugov_policy *sg_policy; + unsigned int response_time_ms; + + if (kstrtouint(buf, 10, &response_time_ms)) + return -EINVAL; + + /* XXX need special handling for high values?
*/ + + tunables->response_time_ms = response_time_ms; + + list_for_each_entry(sg_policy, &attr_set->policy_list, tunables_hook) { + if (sg_policy->tunables == tunables) { + sugov_update_response_time_mult(sg_policy); + break; + } + } + + return count; +} + +static struct governor_attr response_time_ms = __ATTR_RW(response_time_ms); + static struct attribute *sugov_attrs[] = { &rate_limit_us.attr, + &response_time_ms.attr, NULL }; ATTRIBUTE_GROUPS(sugov); @@ -744,11 +849,13 @@ static int sugov_init(struct cpufreq_policy *policy) goto stop_kthread; } - tunables->rate_limit_us = cpufreq_policy_transition_delay_us(policy); - policy->governor_data = sg_policy; sg_policy->tunables = tunables; + tunables->rate_limit_us = cpufreq_policy_transition_delay_us(policy); + tunables->response_time_ms = sugov_calc_freq_response_ms(sg_policy); + sugov_update_response_time_mult(sg_policy); + ret = kobject_init_and_add(&tunables->attr_set.kobj, &sugov_tunables_ktype, get_governor_parent_kobj(policy), "%s", schedutil_gov.name); @@ -808,7 +915,7 @@ static int sugov_start(struct cpufreq_policy *policy) void (*uu)(struct update_util_data *data, u64 time, unsigned int flags); unsigned int cpu; - sg_policy->freq_update_delay_ns = sg_policy->tunables->rate_limit_us * NSEC_PER_USEC; + sg_policy->freq_update_delay_ns = sg_policy->tunables->rate_limit_us * NSEC_PER_USEC; sg_policy->last_freq_update_time = 0; sg_policy->next_freq = 0; sg_policy->work_in_progress = false; -- 2.34.1 From nobody Mon Feb 9 16:23:52 2026
From: Qais Yousef To: Ingo Molnar , Peter Zijlstra , "Rafael J. Wysocki" , Viresh Kumar , Vincent Guittot , Dietmar Eggemann Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Lukasz Luba , Wei Wang , Rick Yiu , Chung-Kai Mei , Qais Yousef Subject: [PATCH v2 8/8] sched/pelt: Introduce PELT multiplier Date: Fri, 8 Dec 2023 00:23:42 +0000 Message-Id: <20231208002342.367117-9-qyousef@layalina.io> In-Reply-To: <20231208002342.367117-1-qyousef@layalina.io> From: Vincent Donnefort The new sched_pelt_multiplier boot param allows a user to set a clock multiplier of x2 or x4 (x1 being the default). This clock multiplier artificially speeds up PELT ramp up/down, similarly to using a shorter half-life than the default 32ms: - x1: 32ms half-life - x2: 16ms half-life - x4: 8ms half-life Internally, a new clock is created: rq->clock_task_mult. It sits in the clock hierarchy between rq->clock_task and rq->clock_pelt. The param is read only and can only be changed at boot time via kernel.sched_pelt_multiplier=[1, 2, 4] PELT has a big impact on the overall system response and reactiveness to change.
A smaller PELT HF means it'll require less time to reach the maximum performance point of the system when the system becomes fully busy, and an equally shorter time to go back to the lowest performance point when the system goes back to idle. This faster reaction impacts both the DVFS response and the migration time between clusters in HMP systems. Smaller PELT values are expected to give better performance at the cost of more power. Underpowered systems can particularly benefit from smaller values. Powerful systems can still benefit from smaller values if they want to be tuned more towards perf and power is not the major concern for them. This, combined with response_time_ms from schedutil, should give the user and sysadmin a deterministic way to control the power, perf and thermal trade-off for their system. The default response_time_ms will halve as the PELT HALFLIFE halves. Update approximate_{util_avg, runtime}() to take into account the PELT HALFLIFE multiplier. Signed-off-by: Vincent Donnefort Signed-off-by: Dietmar Eggemann [Converted from sysctl to boot param and updated commit message] Signed-off-by: Qais Yousef (Google) --- kernel/sched/core.c | 2 +- kernel/sched/pelt.c | 52 ++++++++++++++++++++++++++++++++++++++++++-- kernel/sched/pelt.h | 42 +++++++++++++++++++++++++++++---- kernel/sched/sched.h | 1 + 4 files changed, 90 insertions(+), 7 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index b4a1c8ea9e12..9c8626b4ddff 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -745,7 +745,7 @@ static void update_rq_clock_task(struct rq *rq, s64 delta) if ((irq_delta + steal) && sched_feat(NONTASK_CAPACITY)) update_irq_load_avg(rq, irq_delta + steal); #endif - update_rq_clock_pelt(rq, delta); + update_rq_clock_task_mult(rq, delta); } void update_rq_clock(struct rq *rq) diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c index 00a1b9c1bf16..0a10e56f76c7 100644 --- a/kernel/sched/pelt.c +++ b/kernel/sched/pelt.c @@ -468,6 +468,54 @@ int
update_irq_load_avg(struct rq *rq, u64 running) } #endif /* CONFIG_HAVE_SCHED_AVG_IRQ */ +__read_mostly unsigned int sched_pelt_lshift; +static unsigned int sched_pelt_multiplier = 1; + +static int set_sched_pelt_multiplier(const char *val, const struct kernel_param *kp) +{ + int ret; + + ret = param_set_int(val, kp); + if (ret) + goto error; + + switch (sched_pelt_multiplier) { + case 1: + fallthrough; + case 2: + fallthrough; + case 4: + WRITE_ONCE(sched_pelt_lshift, + sched_pelt_multiplier >> 1); + break; + default: + ret = -EINVAL; + goto error; + } + + return 0; + +error: + sched_pelt_multiplier = 1; + return ret; +} + +static const struct kernel_param_ops sched_pelt_multiplier_ops = { + .set = set_sched_pelt_multiplier, + .get = param_get_int, +}; + +#ifdef MODULE_PARAM_PREFIX +#undef MODULE_PARAM_PREFIX +#endif +/* XXX: should we use sched as prefix? */ +#define MODULE_PARAM_PREFIX "kernel." +module_param_cb(sched_pelt_multiplier, &sched_pelt_multiplier_ops, &sched_pelt_multiplier, 0444); +MODULE_PARM_DESC(sched_pelt_multiplier, "PELT HALFLIFE helps control the responsiveness of the system."); +MODULE_PARM_DESC(sched_pelt_multiplier, "Accepted value: 1 32ms PELT HALFLIFE - roughly 200ms to go from 0 to max performance point (default)."); +MODULE_PARM_DESC(sched_pelt_multiplier, " 2 16ms PELT HALFLIFE - roughly 100ms to go from 0 to max performance point."); +MODULE_PARM_DESC(sched_pelt_multiplier, " 4 8ms PELT HALFLIFE - roughly 50ms to go from 0 to max performance point."); + /* * Approximate the new util_avg value assuming an entity has continued to run * for @delta us.
@@ -482,7 +530,7 @@ unsigned long approximate_util_avg(unsigned long util, u64 delta) if (unlikely(!delta)) return util; - accumulate_sum(delta, &sa, 1, 0, 1); + accumulate_sum(delta << sched_pelt_lshift, &sa, 1, 0, 1); ___update_load_avg(&sa, 0); return sa.util_avg; @@ -494,7 +542,7 @@ unsigned long approximate_util_avg(unsigned long util, u64 delta) u64 approximate_runtime(unsigned long util) { struct sched_avg sa = {}; - u64 delta = 1024; // period = 1024 = ~1ms + u64 delta = 1024 << sched_pelt_lshift; // period = 1024 = ~1ms u64 runtime = 0; if (unlikely(!util)) diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h index 3a0e0dc28721..9b35b5072bae 100644 --- a/kernel/sched/pelt.h +++ b/kernel/sched/pelt.h @@ -61,6 +61,14 @@ static inline void cfs_se_util_change(struct sched_avg *avg) WRITE_ONCE(avg->util_est.enqueued, enqueued); } +static inline u64 rq_clock_task_mult(struct rq *rq) +{ + lockdep_assert_rq_held(rq); + assert_clock_updated(rq); + + return rq->clock_task_mult; +} + static inline u64 rq_clock_pelt(struct rq *rq) { lockdep_assert_rq_held(rq); @@ -72,7 +80,7 @@ static inline u64 rq_clock_pelt(struct rq *rq) /* The rq is idle, we can sync to clock_task */ static inline void _update_idle_rq_clock_pelt(struct rq *rq) { - rq->clock_pelt = rq_clock_task(rq); + rq->clock_pelt = rq_clock_task_mult(rq); u64_u32_store(rq->clock_idle, rq_clock(rq)); /* Paired with smp_rmb in migrate_se_pelt_lag() */ @@ -121,6 +129,27 @@ static inline void update_rq_clock_pelt(struct rq *rq, s64 delta) rq->clock_pelt += delta; } +extern unsigned int sched_pelt_lshift; + +/* + * absolute time |1 |2 |3 |4 |5 |6 | + * @ mult = 1 --------****************--------****************- + * @ mult = 2 --------********----------------********--------- + * @ mult = 4 --------****--------------------****------------- + * clock task mult + * @ mult = 2 | | |2 |3 | | | | |5 |6 | | | + * @ mult = 4 | | | | |2|3| | | | | | | | | |
|5|6| | | | | | | + */ +static inline void update_rq_clock_task_mult(struct rq *rq, s64 delta) +{ + delta <<= READ_ONCE(sched_pelt_lshift); + + rq->clock_task_mult += delta; + + update_rq_clock_pelt(rq, delta); +} + /* * When rq becomes idle, we have to check if it has lost idle time * because it was fully busy. A rq is fully used when the /Sum util_sum @@ -147,7 +176,7 @@ static inline void update_idle_rq_clock_pelt(struct rq *rq) * rq's clock_task. */ if (util_sum >= divider) - rq->lost_idle_time += rq_clock_task(rq) - rq->clock_pelt; + rq->lost_idle_time += rq_clock_task_mult(rq) - rq->clock_pelt; _update_idle_rq_clock_pelt(rq); } @@ -218,13 +247,18 @@ update_irq_load_avg(struct rq *rq, u64 running) return 0; } -static inline u64 rq_clock_pelt(struct rq *rq) +static inline u64 rq_clock_task_mult(struct rq *rq) { return rq_clock_task(rq); } +static inline u64 rq_clock_pelt(struct rq *rq) +{ + return rq_clock_task_mult(rq); +} + static inline void -update_rq_clock_pelt(struct rq *rq, s64 delta) { } +update_rq_clock_task_mult(struct rq *rq, s64 delta) { } static inline void update_idle_rq_clock_pelt(struct rq *rq) { } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index bbece0eb053a..a7c89c623250 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -1029,6 +1029,7 @@ struct rq { u64 clock; /* Ensure that all clocks are in the same cache line */ u64 clock_task ____cacheline_aligned; + u64 clock_task_mult; u64 clock_pelt; unsigned long lost_idle_time; u64 clock_pelt_idle; -- 2.34.1
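The clock manipulation in update_rq_clock_task_mult() above boils down to a left shift of the task-clock delta, which makes PELT see time pass 2x or 4x faster and is therefore equivalent to halving or quartering the 32 ms half-life. A minimal sketch of that arithmetic (a standalone model, not the kernel code; helper names are illustrative):

```c
#include <assert.h>

/* multiplier 1/2/4 maps to lshift 0/1/2, as in set_sched_pelt_multiplier() */
static unsigned int pelt_lshift(unsigned int multiplier)
{
	return multiplier >> 1;	/* 1 -> 0, 2 -> 1, 4 -> 2 */
}

/* the delta PELT accumulates after the clock_task_mult scaling */
static long long scaled_delta(long long delta, unsigned int multiplier)
{
	return delta << pelt_lshift(multiplier);
}

/* effective half-life seen by PELT, starting from the default 32 ms */
static unsigned int effective_halflife_ms(unsigned int multiplier)
{
	return 32 >> pelt_lshift(multiplier);	/* 32, 16 or 8 ms */
}
```

Doubling the rate at which PELT's clock advances is mathematically the same as halving the decay half-life, which is why the patch can offer 16 ms and 8 ms behavior without touching the PELT decay tables themselves.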