From: Jack Allister
CC: Jack Allister, "Rafael J . Wysocki", Paul Durrant, Jue Wang, Usama Arif,
    Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    Dave Hansen, "H. Peter Anvin", "Paul E. McKenney", Randy Dunlap,
    Tejun Heo, Peter Zijlstra, Yan-Jie Wang, Hans de Goede
Subject: [PATCH v6] x86: intel_epb: Add earlyparam option to keep bias at performance
Date: Thu, 4 Jan 2024 09:05:48 +0000
Message-ID: <20240104090551.46251-1-jalliste@amazon.com>
X-Mailer: git-send-email 2.40.1
X-Mailing-List: linux-kernel@vger.kernel.org

Buggy BIOSes may not set a sane boot-time Energy Performance Bias (EPB).
A result of this may be overheating or excess power usage. The kernel
overrides any boot-time EPB "performance" bias to "normal" to avoid this.

When used in data centers it is preferable to keep the EPB at "performance"
when performing a live-update of the host kernel via a kexec to the new
kernel.
This is due to boot-time being critical when performing the kexec, as running
guest VMs will perceive this as latency or downtime.

On Intel Xeon Ice Lake platforms it has been observed that a combination of
EPB being set to "normal" alongside HWP (Intel Hardware P-states) being
enabled/configured during or close to the kexec increases the
live-update/kexec downtime by a factor of 7 compared to when the EPB is set
to "performance".

Introduce a command-line parameter, "intel_epb=preserve", to skip the
"performance" -> "normal" override/workaround. This maintains prior
functionality when no parameter is set, but adds the ability to stay at
performance for a speedy kexec if a user wishes.

Signed-off-by: Jack Allister
Acked-by: Rafael J. Wysocki
Cc: Paul Durrant
Cc: Jue Wang
Cc: Usama Arif
Reviewed-by: Paul Durrant
---
 .../admin-guide/kernel-parameters.txt         |  9 ++++++++
 arch/x86/kernel/cpu/intel_epb.c               | 22 +++++++++++++++++--
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 65731b060e3f..d28f2fc41c0c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2148,6 +2148,15 @@
 			0	disables intel_idle and fall back on acpi_idle.
 			1 to 9	specify maximum depth of C-state.
 
+	intel_epb=	[X86]
+			auto (default)
+			  Work around buggy BIOSes to avoid excess power usage
+			  by forcing the performance bias to "normal" at boot-time.
+			preserve
+			  Do not override the existing performance bias setting.
+			  Useful if a previous kernel or bootloader's setting is
+			  more desirable than "normal".
+
 	intel_pstate=	[X86]
 			disable
 			  Do not enable intel_pstate as the default
diff --git a/arch/x86/kernel/cpu/intel_epb.c b/arch/x86/kernel/cpu/intel_epb.c
index e4c3ba91321c..01d406177751 100644
--- a/arch/x86/kernel/cpu/intel_epb.c
+++ b/arch/x86/kernel/cpu/intel_epb.c
@@ -50,7 +50,8 @@
  * the OS will do that anyway.  That sometimes is problematic, as it may cause
  * the system battery to drain too fast, for example, so it is better to adjust
  * it on CPU bring-up and if the initial EPB value for a given CPU is 0, the
- * kernel changes it to 6 ('normal').
+ * kernel changes it to 6 ('normal'). However, if it is desirable to retain the
+ * original initial EPB value, intel_epb=preserve can be set to enforce it.
  */
 
 static DEFINE_PER_CPU(u8, saved_epb);
@@ -75,6 +76,8 @@ static u8 energ_perf_values[] = {
 	[EPB_INDEX_POWERSAVE] = ENERGY_PERF_BIAS_POWERSAVE,
 };
 
+static bool intel_epb_no_override __read_mostly;
+
 static int intel_epb_save(void)
 {
 	u64 epb;
@@ -106,7 +109,7 @@ static void intel_epb_restore(void)
 		 * ('normal').
 		 */
 		val = epb & EPB_MASK;
-		if (val == ENERGY_PERF_BIAS_PERFORMANCE) {
+		if (!intel_epb_no_override && val == ENERGY_PERF_BIAS_PERFORMANCE) {
 			val = energ_perf_values[EPB_INDEX_NORMAL];
 			pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
 		}
@@ -213,6 +216,21 @@ static const struct x86_cpu_id intel_epb_normal[] = {
 	{}
 };
 
+static __init int parse_intel_epb(char *str)
+{
+	if (!str)
+		return 0;
+
+	/* "intel_epb=preserve" prevents PERFORMANCE->NORMAL on restore. */
+	if (!strcmp(str, "preserve"))
+		intel_epb_no_override = true;
+
+	/* "intel_epb=auto" not explicitly checked as default behaviour. */
+	return 0;
+}
+
+early_param("intel_epb", parse_intel_epb);
+
 static __init int intel_epb_init(void)
 {
 	const struct x86_cpu_id *id = x86_match_cpu(intel_epb_normal);
-- 
2.40.1
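
For readers who want to check the intended behaviour outside the kernel, the
decision made by the patched restore path can be modelled in user space. The
following is only a sketch under stated assumptions, not kernel code: the
names parse_epb_param(), epb_boot_value() and no_override are hypothetical
stand-ins for parse_intel_epb(), the check in intel_epb_restore() and
intel_epb_no_override. The values 0 ('performance') and 6 ('normal') come
from the comment in intel_epb.c quoted above; the 4-bit mask is assumed to
match the kernel's EPB_MASK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* EPB encodings named in the patch comment: 0 is 'performance', 6 is 'normal'. */
#define EPB_PERFORMANCE 0
#define EPB_NORMAL      6
/* Assumption: the EPB field is the low 4 bits of the MSR value. */
#define EPB_MASK        0x0F

/* Hypothetical stand-in for intel_epb_no_override. */
static bool no_override;

/*
 * Hypothetical stand-in for parse_intel_epb(): only "preserve" changes
 * state; "auto", unknown strings and NULL leave the default untouched.
 */
static int parse_epb_param(const char *str)
{
	if (!str)
		return 0;

	if (!strcmp(str, "preserve"))
		no_override = true;

	return 0;
}

/*
 * Models the check in intel_epb_restore(): a boot-time 'performance'
 * bias is replaced by 'normal' unless the override has been disabled.
 */
static uint8_t epb_boot_value(uint64_t epb)
{
	uint8_t val = epb & EPB_MASK;

	if (!no_override && val == EPB_PERFORMANCE)
		val = EPB_NORMAL;

	return val;
}
```

With no parameter (or "intel_epb=auto"), epb_boot_value(0) yields 6, which
matches the existing behaviour. After parse_epb_param("preserve") it yields
0, so a 'performance' bias set by the previous kernel survives the kexec.
Values other than 0 are passed through unchanged in both modes.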