[tip: x86/cpu] x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in a virtual machine

tip-bot2 for Max Grobecker posted 1 patch 9 months, 2 weeks ago
There is a newer version of this series
arch/x86/kernel/cpu/amd.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
The following commit has been merged into the x86/cpu branch of tip:

Commit-ID:     faf7a039659bf8e8afaf4cb9e0106b268e9acdb4
Gitweb:        https://git.kernel.org/tip/faf7a039659bf8e8afaf4cb9e0106b268e9acdb4
Author:        Max Grobecker <max@grobecker.info>
AuthorDate:    Thu, 27 Feb 2025 21:45:05 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Thu, 27 Feb 2025 21:45:20 +01:00

x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in a virtual machine

When running in a virtual machine, we might see the original hardware CPU
vendor string (e.g. "AuthenticAMD"), but a model and family ID set by the
hypervisor. If we run on AMD hardware and the hypervisor sets a model
ID < 0x14, the "lahf_lm" CPU feature is removed from the list of CPU
capabilities, a workaround for a bug in some BIOSes used with AMD K8
processors.

Parsing the flags list from /proc/cpuinfo seems to happen mostly in
bash scripts and prebuilt Docker containers, as it does not require
additional tools, even though more reliable methods such as "kcpuid",
which executes the CPUID instruction instead of parsing a list, should be
preferred. Scripts that use /proc/cpuinfo to determine whether the current
CPU is "compliant" with defined microarchitecture levels like x86-64-v2 will
falsely claim that the CPU lacks modern instructions when "lahf_lm" is
missing from that flags list.
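
As a rough illustration of the "ask CPUID directly" approach mentioned above,
the following userspace sketch reads the LAHF/SAHF bit (CPUID leaf 0x80000001,
ECX bit 0) instead of parsing /proc/cpuinfo. It is not how kcpuid is
implemented; the helper names ecx_has_lahf_lm() and query_lahf_lm() are
hypothetical, and it assumes GCC or Clang, whose <cpuid.h> provides
__get_cpuid().

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper header, x86 only */
#endif

/* CPUID leaf 0x80000001, ECX bit 0: LAHF/SAHF available in 64-bit mode. */
#define EXT_ECX_LAHF_LM	(UINT32_C(1) << 0)

/* Hypothetical helper: test the LAHF/SAHF bit in a raw ECX value. */
bool ecx_has_lahf_lm(uint32_t ext_ecx)
{
	return (ext_ecx & EXT_ECX_LAHF_LM) != 0;
}

/*
 * Hypothetical helper: returns 1 if CPUID reports LAHF/SAHF in 64-bit mode,
 * 0 if it does not, and -1 if the leaf cannot be queried (leaf unsupported
 * or not an x86 build).
 */
int query_lahf_lm(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid() returns 0 when the requested leaf is not supported. */
	if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
		return -1;

	return ecx_has_lahf_lm(ecx) ? 1 : 0;
#else
	return -1;
#endif
}
```

In a guest, the value returned is whatever the hypervisor exposes through
CPUID, which is why it can disagree with the flags list after the kernel has
cleared the capability bit.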

This can prevent some Docker containers from starting or cause build scripts
to create unoptimized binaries.

Admittedly, this is more of a small inconvenience than a severe kernel bug,
and the shoddy scripts that rely on parsing /proc/cpuinfo should be fixed
instead.

This patch adds an additional check to see if we're running inside a
virtual machine (X86_FEATURE_HYPERVISOR is present), which, to my
understanding, can't be present on a real K8 processor as it was introduced
only with the later/other Athlon64 models.
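
The resulting decision can be sketched outside the kernel as a pure function.
This is only an illustration of the condition, not the kernel code: the name
should_clear_lahf_lm() and its parameters are hypothetical, and the
hypervisor-present bit is taken from CPUID leaf 1, ECX bit 31, which is what
X86_FEATURE_HYPERVISOR reflects.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX bit 31: set by hypervisors to signal a guest. */
#define LEAF1_ECX_HYPERVISOR	(UINT32_C(1) << 31)

/*
 * Hypothetical stand-in for the guarded condition: apply the K8 erratum
 * workaround only on bare metal, where the reported model ID can be trusted.
 */
bool should_clear_lahf_lm(unsigned int model, bool has_lahf_lm,
			  uint32_t leaf1_ecx)
{
	bool in_guest = (leaf1_ecx & LEAF1_ECX_HYPERVISOR) != 0;

	return model < 0x14 && has_lahf_lm && !in_guest;
}
```

With the values from the example below (model 6, hypervisor bit set), the
function returns false, so the flag would be kept.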

Example output with the "lahf_lm" flag missing in the flags list
(should be shown between "hypervisor" and "abm"):

    $ cat /proc/cpuinfo
    processor       : 0
    vendor_id       : AuthenticAMD
    cpu family      : 15
    model           : 6
    model name      : Common KVM processor
    stepping        : 1
    microcode       : 0x1000065
    cpu MHz         : 2599.998
    cache size      : 512 KB
    physical id     : 0
    siblings        : 1
    core id         : 0
    cpu cores       : 1
    apicid          : 0
    initial apicid  : 0
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 13
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
                      cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp
                      lm rep_good nopl cpuid extd_apicid tsc_known_freq pni
                      pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt
                      tsc_deadline_timer aes xsave avx f16c hypervisor abm
                      3dnowprefetch vmmcall bmi1 avx2 bmi2 xsaveopt

... while kcpuid shows the feature to be present in the CPU:

    # kcpuid -d | grep lahf
         lahf_lm             - LAHF/SAHF available in 64-bit mode

[ mingo: Updated the comment a bit. ]

Signed-off-by: Max Grobecker <max@grobecker.info>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lore.kernel.org/r/533f9cf-1957-41e8-a8cc-ddce5438f658-max@grobecker.info
---
 arch/x86/kernel/cpu/amd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 54194f5..c1f0a5f 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -631,8 +631,11 @@ static void init_amd_k8(struct cpuinfo_x86 *c)
 	 * Some BIOSes incorrectly force this feature, but only K8 revision D
 	 * (model = 0x14) and later actually support it.
 	 * (AMD Erratum #110, docId: 25759).
+	 * Only clear capability flag if we're running on baremetal,
+	 * as we might see a wrong model ID as a guest kernel. In such a case,
+	 * we can safely assume we're not affected by this erratum.
 	 */
-	if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM)) {
+	if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM) && !cpu_has(c, X86_FEATURE_HYPERVISOR)) {
 		clear_cpu_cap(c, X86_FEATURE_LAHF_LM);
 		if (!rdmsrl_amd_safe(0xc001100d, &value)) {
 			value &= ~BIT_64(32);
Re: [tip: x86/cpu] x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in a virtual machine
Posted by Borislav Petkov 9 months, 2 weeks ago
On Thu, Feb 27, 2025 at 09:02:22PM -0000, tip-bot2 for Max Grobecker wrote:
> This can prevent some Docker containers from starting or cause build scripts
> to create unoptimized binaries.

Who does docker containers with a K8 CPU model? What's the advantage?

> Admittedly, this is more of a small inconvenience than a severe kernel bug,
> and the shoddy scripts that rely on parsing /proc/cpuinfo should be fixed
> instead.

Yes.

I find such "wag-the-dog" patches awful.

> This patch adds an additional check to see if we're running inside a

Avoid having "This patch" or "This commit" in the commit message. It is
tautologically useless.

Also, do

$ git grep 'This patch' Documentation/process

for more details.

> virtual machine (X86_FEATURE_HYPERVISOR is present), which, to my
> understanding, can't be present on a real K8 processor as it was introduced
> only with the later/other Athlon64 models.
> 
> Example output with the "lahf_lm" flag missing in the flags list
> (should be shown between "hypervisor" and "abm"):
> 
>     $ cat /proc/cpuinfo
>     processor       : 0
>     vendor_id       : AuthenticAMD
>     cpu family      : 15
>     model           : 6
>     model name      : Common KVM processor
>     stepping        : 1
>     microcode       : 0x1000065
>     cpu MHz         : 2599.998
>     cache size      : 512 KB
>     physical id     : 0
>     siblings        : 1
>     core id         : 0
>     cpu cores       : 1
>     apicid          : 0
>     initial apicid  : 0
>     fpu             : yes
>     fpu_exception   : yes
>     cpuid level     : 13
>     wp              : yes
>     flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
>                       cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx rdtscp
>                       lm rep_good nopl cpuid extd_apicid tsc_known_freq pni
>                       pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt
>                       tsc_deadline_timer aes xsave avx f16c hypervisor abm
>                       3dnowprefetch vmmcall bmi1 avx2 bmi2 xsaveopt

This dump is purely useless - it is clear what the code does currently. No
need to dump it.

> 
> ... while kcpuid shows the feature to be present in the CPU:
> 
>     # kcpuid -d | grep lahf
>          lahf_lm             - LAHF/SAHF available in 64-bit mode
> 
> [ mingo: Updated the comment a bit. ]
> 
> Signed-off-by: Max Grobecker <max@grobecker.info>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Link: https://lore.kernel.org/r/533f9cf-1957-41e8-a8cc-ddce5438f658-max@grobecker.info
> ---
>  arch/x86/kernel/cpu/amd.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 54194f5..c1f0a5f 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -631,8 +631,11 @@ static void init_amd_k8(struct cpuinfo_x86 *c)
>  	 * Some BIOSes incorrectly force this feature, but only K8 revision D
>  	 * (model = 0x14) and later actually support it.
>  	 * (AMD Erratum #110, docId: 25759).
> +	 * Only clear capability flag if we're running on baremetal,
> +	 * as we might see a wrong model ID as a guest kernel. In such a case,
> +	 * we can safely assume we're not affected by this erratum.
>  	 */

This comment needs to be in the commit message - we don't document every use
of X86_FEATURE_HYPERVISOR.

> -	if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM)) {
> +	if (c->x86_model < 0x14 && cpu_has(c, X86_FEATURE_LAHF_LM) && !cpu_has(c, X86_FEATURE_HYPERVISOR)) {
>  		clear_cpu_cap(c, X86_FEATURE_LAHF_LM);
>  		if (!rdmsrl_amd_safe(0xc001100d, &value)) {
>  			value &= ~BIT_64(32);

But again, I'm very sceptical about K8 and docker containers and don't buy it.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette