There are certain scenarios where it may be intentional that the EPB was
set to 0/ENERGY_PERF_BIAS_PERFORMANCE on kernel boot. For example, in
data centers a kexec/live-update of the kernel may be performed regularly.

Usually this live-update is time critical and defaulting of the bias back
to ENERGY_PERF_BIAS_NORMAL may actually be detrimental to the overall
update time if processors' time to ramp up/boost is affected.

This patch introduces a kernel command line parameter,
"intel_epb_no_override", which will leave the EPB at performance if during
the restoration code path it is detected as such.
Signed-off-by: Jack Allister <jalliste@amazon.com>
Cc: Paul Durrant <pdurrant@amazon.com>
Cc: Jue Wang <juew@amazon.com>
Cc: Usama Arif <usama.arif@bytedance.com>
---
arch/x86/kernel/cpu/intel_epb.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel_epb.c b/arch/x86/kernel/cpu/intel_epb.c
index e4c3ba91321c..cbe0e224b8d9 100644
--- a/arch/x86/kernel/cpu/intel_epb.c
+++ b/arch/x86/kernel/cpu/intel_epb.c
@@ -50,7 +50,8 @@
  * the OS will do that anyway. That sometimes is problematic, as it may cause
  * the system battery to drain too fast, for example, so it is better to adjust
  * it on CPU bring-up and if the initial EPB value for a given CPU is 0, the
- * kernel changes it to 6 ('normal').
+ * kernel changes it to 6 ('normal'). This however is overridable via
+ * intel_epb_no_override if required.
  */
 
 static DEFINE_PER_CPU(u8, saved_epb);
@@ -75,6 +76,8 @@ static u8 energ_perf_values[] = {
 	[EPB_INDEX_POWERSAVE] = ENERGY_PERF_BIAS_POWERSAVE,
 };
 
+static bool intel_epb_no_override __read_mostly;
+
 static int intel_epb_save(void)
 {
 	u64 epb;
@@ -106,7 +109,7 @@ static void intel_epb_restore(void)
 		 * ('normal').
 		 */
 		val = epb & EPB_MASK;
-		if (val == ENERGY_PERF_BIAS_PERFORMANCE) {
+		if (!intel_epb_no_override && val == ENERGY_PERF_BIAS_PERFORMANCE) {
 			val = energ_perf_values[EPB_INDEX_NORMAL];
 			pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
 		}
@@ -213,6 +216,12 @@ static const struct x86_cpu_id intel_epb_normal[] = {
 	{}
 };
 
+static __init int intel_epb_no_override_setup(char *str)
+{
+	return kstrtobool(str, &intel_epb_no_override);
+}
+early_param("intel_epb_no_override", intel_epb_no_override_setup);
+
 static __init int intel_epb_init(void)
 {
 	const struct x86_cpu_id *id = x86_match_cpu(intel_epb_normal);
--
2.40.1
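
For readers who want to see the effect of the one-line change in
intel_epb_restore() in isolation, the following is a minimal user-space
sketch of the first-online branch, not the kernel code itself. The helper
name epb_first_online_value() is hypothetical. The values 0 ('performance')
and 6 ('normal') come from the comment in the patch; the 4-bit EPB_MASK is
an assumption about the kernel's definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch comment: 0 is 'performance', 6 is 'normal'. */
#define ENERGY_PERF_BIAS_PERFORMANCE	0
#define ENERGY_PERF_BIAS_NORMAL		6
/* Assumption: the EPB occupies the low four bits of the MSR value. */
#define EPB_MASK			0x0fULL

/*
 * Hypothetical helper modelling the case where no EPB value has been saved
 * for the CPU yet. With no_override clear, a boot-time 'performance' bias
 * is replaced by 'normal'; with no_override set, it is left alone.
 */
uint64_t epb_first_online_value(uint64_t msr_epb, bool no_override)
{
	uint64_t val = msr_epb & EPB_MASK;

	if (!no_override && val == ENERGY_PERF_BIAS_PERFORMANCE)
		val = ENERGY_PERF_BIAS_NORMAL;

	return val;
}
```

In this model only a boot-time value of 0 is affected by the flag; any
other bias is returned unchanged whether or not the override is disabled.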
Sorry, it looks like I missed the v2 flag in the subject, and the commit
message did not reflect the rename made since v1.
Both should be fixed in v3.
On 12/5/23 05:23, Jack Allister wrote:
> There are certain scenarios where it may be intentional that the EPB was
> set to 0/ENERGY_PERF_BIAS_PERFORMANCE on kernel boot. For example, in
> data centers a kexec/live-update of the kernel may be performed regularly.
>
> Usually this live-update is time critical and defaulting of the bias back
> to ENERGY_PERF_BIAS_NORMAL may actually be detrimental to the overall
> update time if processors' time to ramp up/boost is affected.

If this makes your kexecs 7 times faster, please say that here.

Could we also please make this less wishy-washy? "May actually be
detrimental" does not scream how critical this is for you.

> This patch introduces a kernel command line "intel_epb_no_override"
> which will leave the EPB at performance if during the restoration code path
> it is detected as such.

No "this patch", please:

	https://www.kernel.org/doc/html/next/process/maintainer-tip.html

This also needs documentation of the parameter in
Documentation/admin-guide/kernel-parameters.txt.

Let me see if I can write a sane changelog, summarizing the discussion
here for posterity. If there's confusion about a v1 patch that's cleared
up in the discussion, it would be wonderful to capture that in the v2
changelog as opposed to making minimal changes.

How's this? I think it captures some of the things that Rafael related
and also additional information about the use case that motivated this
effort.

--

Buggy BIOSes set a sane boot-time Energy Performance Bias (EPB) that
causes overheating. The kernel overrides any boot-time EPB "performance"
bias to "normal" to avoid this.

<Hardware name here> platforms can tolerate a "performance" bias during
boot without overheating. In addition, because of <root cause(s) here>,
a kexec with a "normal" bias is seven times slower than one with a
"performance" bias. Boot time is critical when performing a
kexec/live-update of a kernel which is running guest VMs, since boot
time appears as guest latency or downtime.

Introduce a command-line parameter, "intel_epb_no_override", to skip the
"performance"=>"normal" override. This allows folks to get a speedy
kexec without exposing other folks with wonky BIOSes to overheating.
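
Since the parameter still needs documenting, it may help to spell out
which values would switch it on. The patch hands the string to
kstrtobool(); the sketch below is a simplified user-space model covering
only the common spellings (1/0, y/n, on/off). The function name
parse_bool_param() is hypothetical, and the real helper may accept more
spellings than shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified model of boolean parameter parsing.
 * Returns 0 and stores the result in *res on success, -1 otherwise.
 * Only the first character is examined, or the first two for "on"/"off".
 */
int parse_bool_param(const char *s, bool *res)
{
	if (!s)
		return -1;	/* no value supplied at all */

	switch (s[0]) {
	case 'y': case 'Y': case '1':
		*res = true;
		return 0;
	case 'n': case 'N': case '0':
		*res = false;
		return 0;
	case 'o': case 'O':
		if (s[1] == 'n' || s[1] == 'N') {
			*res = true;
			return 0;
		}
		if (s[1] == 'f' || s[1] == 'F') {
			*res = false;
			return 0;
		}
		break;
	default:
		break;
	}

	return -1;
}
```

Under this model "intel_epb_no_override=1" (or "=y", "=on") enables the
new behaviour. A missing value is rejected, so if a bare
"intel_epb_no_override" reaches the setup function as a NULL string, it
would not enable anything. That is worth stating explicitly in the
documentation entry.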