[PATCH v3] x86: intel_epb: Add earlyparam option to keep bias at performance

Jack Allister posted 1 patch 2 years ago
There is a newer version of this series
Posted by Jack Allister 2 years ago
There are certain scenarios where it may be intentional that the EPB was
set to 0/ENERGY_PERF_BIAS_PERFORMANCE on kernel boot. For example, in
data centers a kexec/live-update of the kernel may be performed regularly.

Usually this live-update is time critical and defaulting the bias back
to ENERGY_PERF_BIAS_NORMAL may actually be detrimental to the overall
update time if the processors' time to ramp up/boost is affected.

This patch introduces a kernel command line parameter,
"intel_epb_no_override", which leaves the EPB at performance if it is
detected as such during the restoration code path.

Signed-off-by: Jack Allister <jalliste@amazon.com>
Cc: Paul Durrant <pdurrant@amazon.com>
Cc: Jue Wang <juew@amazon.com>
Cc: Usama Arif <usama.arif@bytedance.com>
---
 arch/x86/kernel/cpu/intel_epb.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_epb.c b/arch/x86/kernel/cpu/intel_epb.c
index e4c3ba91321c..cbe0e224b8d9 100644
--- a/arch/x86/kernel/cpu/intel_epb.c
+++ b/arch/x86/kernel/cpu/intel_epb.c
@@ -50,7 +50,8 @@
  * the OS will do that anyway.  That sometimes is problematic, as it may cause
  * the system battery to drain too fast, for example, so it is better to adjust
  * it on CPU bring-up and if the initial EPB value for a given CPU is 0, the
- * kernel changes it to 6 ('normal').
+ * kernel changes it to 6 ('normal'). This however is overridable via
+ * intel_epb_no_override if required.
  */
 
 static DEFINE_PER_CPU(u8, saved_epb);
@@ -75,6 +76,8 @@ static u8 energ_perf_values[] = {
 	[EPB_INDEX_POWERSAVE] = ENERGY_PERF_BIAS_POWERSAVE,
 };
 
+static bool intel_epb_no_override __read_mostly;
+
 static int intel_epb_save(void)
 {
 	u64 epb;
@@ -106,7 +109,7 @@ static void intel_epb_restore(void)
 		 * ('normal').
 		 */
 		val = epb & EPB_MASK;
-		if (val == ENERGY_PERF_BIAS_PERFORMANCE) {
+		if (!intel_epb_no_override && val == ENERGY_PERF_BIAS_PERFORMANCE) {
 			val = energ_perf_values[EPB_INDEX_NORMAL];
 			pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
 		}
@@ -213,6 +216,12 @@ static const struct x86_cpu_id intel_epb_normal[] = {
 	{}
 };
 
+static __init int intel_epb_no_override_setup(char *str)
+{
+	return kstrtobool(str, &intel_epb_no_override);
+}
+early_param("intel_epb_no_override", intel_epb_no_override_setup);
+
 static __init int intel_epb_init(void)
 {
 	const struct x86_cpu_id *id = x86_match_cpu(intel_epb_normal);
-- 
2.40.1

Sorry, it looks like I missed the v2 flag in the subject; also, the
commit message did not reflect the rename made since v1.

This should all be fixed in v3 now.
Re: [PATCH v3] x86: intel_epb: Add earlyparam option to keep bias at performance
Posted by Dave Hansen 2 years ago
On 12/5/23 05:23, Jack Allister wrote:
> There are certain scenarios where it may be intentional that the EPB was
> set to 0/ENERGY_PERF_BIAS_PERFORMANCE on kernel boot. For example, in
> data centers a kexec/live-update of the kernel may be performed regularly.
> 
> Usually this live-update is time critical and defaulting the bias back
> to ENERGY_PERF_BIAS_NORMAL may actually be detrimental to the overall
> update time if the processors' time to ramp up/boost is affected.

If this makes your kexecs 7 times faster, please say that here.

Could we also please make this less wishy-washy?  "May actually be
detrimental" does not scream how critical this is for you.

> This patch introduces a kernel command line parameter,
> "intel_epb_no_override", which leaves the EPB at performance if it is
> detected as such during the restoration code path.

No "this patch", please:

	https://www.kernel.org/doc/html/next/process/maintainer-tip.html

This also needs documentation of the parameter in
Documentation/admin-guide/kernel-parameters.txt.

Let me see if I can write a sane changelog, summarizing the discussion
here for posterity.  If there's confusion about a v1 patch that's
cleared up in the discussion, it would be wonderful to capture that in
the v2 changelog as opposed to making minimal changes.  How's this?  I
think it captures some of the things that Rafael related and also
additional information about the use case that motivated this effort.

--

Buggy BIOSes set a boot-time Energy Performance Bias (EPB) of
"performance" that causes overheating.  The kernel overrides any
boot-time EPB "performance" bias to "normal" to avoid this.

<Hardware name here> platforms can tolerate a "performance" bias during
boot without overheating.  In addition, because of <root cause(s) here>,
a kexec with a "normal" bias is seven times slower than one with a
"performance" bias.  Boot time is critical when performing a
kexec/live-update of a kernel that is running guest VMs, since boot
time appears as guest latency or downtime.

Introduce a command-line parameter, "intel_epb_no_override", to skip the
"performance"=>"normal" override.  This allows folks to get a speedy
kexec without exposing other folks with wonky BIOSes to overheating.