From: Dave Hansen <dave.hansen@linux.intel.com>
You can't practically run old microcode and consider a system secure
these days. So, let's call old microcode what it is: a vulnerability.
Expose that vulnerability in a place that folks can find it:
/sys/devices/system/cpu/vulnerabilities/old_microcode
This is obviously imperfect. But it means that a single file can be
maintained with a single list of microcode versions and there is no need
to track which version fixed a given bug.
== Microcode Revision Discussion ==
The microcode versions in the table were generated from the Intel
microcode git repo:
29f82f7429c ("microcode-20241029 Release")
It can be argued that the versions that the kernel picks to call "old"
should be a revision or two old. Which specific version is picked is
less important to me than picking *a* version and enforcing it.
This repository contains only microcode versions that Intel has deemed
to be OS-loadable. It is quite possible that the BIOS has loaded a
newer microcode than the latest in this repo. That means that the
vulnerability can be considered to answer this question:
Are we running on the latest OS-loadable microcode,
or something even later that the BIOS loaded?
In other words, Intel never publishes an authoritative list of CPUs
and latest microcode revisions. Until it does, this is the best that
Linux can do.
Also note that the "intel-ucode-defs.h" file is simple, ugly and
has lots of magic numbers. That's on purpose and should allow a
single file to be shared across lots of stable kernels regardless of
whether they have the new "VFM" infrastructure or not. It was generated
with a dumb script.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---
b/arch/x86/include/asm/cpufeatures.h | 1
b/arch/x86/kernel/cpu/bugs.c | 8 +
b/arch/x86/kernel/cpu/common.c | 28 +++
b/arch/x86/kernel/cpu/microcode/intel-ucode-defs.h | 150 +++++++++++++++++++++
b/drivers/base/cpu.c | 3
b/include/linux/cpu.h | 2
6 files changed, 192 insertions(+)
diff -puN arch/x86/include/asm/cpufeatures.h~old-ucode-0 arch/x86/include/asm/cpufeatures.h
--- a/arch/x86/include/asm/cpufeatures.h~old-ucode-0 2024-11-06 07:51:07.364536033 -0800
+++ b/arch/x86/include/asm/cpufeatures.h 2024-11-06 07:51:07.372536037 -0800
@@ -525,4 +525,5 @@
#define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* "div0" AMD DIV0 speculation bug */
#define X86_BUG_RFDS X86_BUG(1*32 + 2) /* "rfds" CPU is vulnerable to Register File Data Sampling */
#define X86_BUG_BHI X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */
+#define X86_BUG_OLD_MICROCODE X86_BUG(1*32 + 4) /* "old_microcode" CPU has old microcode, it is surely vulnerable to something */
#endif /* _ASM_X86_CPUFEATURES_H */
diff -puN arch/x86/kernel/cpu/common.c~old-ucode-0 arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~old-ucode-0 2024-11-06 07:51:07.368536035 -0800
+++ b/arch/x86/kernel/cpu/common.c 2024-11-07 09:01:58.709687132 -0800
@@ -1317,6 +1317,31 @@ static bool __init vulnerable_to_rfds(u6
return cpu_matches(cpu_vuln_blacklist, RFDS);
}
+struct x86_cpu_id cpu_latest_microcode[] = {
+#include "microcode/intel-ucode-defs.h"
+ {}
+};
+
+static bool __init cpu_has_old_microcode(void)
+{
+ const struct x86_cpu_id *m = x86_match_cpu(cpu_latest_microcode);
+
+ /* Give unknown CPUs a pass: */
+ if (!m)
+ return false;
+
+ /* Consider all debug microcode to be old: */
+ if (boot_cpu_data.microcode & BIT(31))
+ return true;
+
+ /* Give new microcode a pass: */
+ if (boot_cpu_data.microcode >= m->driver_data)
+ return false;
+
+ /* Uh oh, too old: */
+ return true;
+}
+
static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
{
u64 x86_arch_cap_msr = x86_read_arch_cap_msr();
@@ -1443,6 +1468,9 @@ static void __init cpu_set_bug_bits(stru
boot_cpu_has(X86_FEATURE_HYPERVISOR)))
setup_force_cpu_bug(X86_BUG_BHI);
+ if (cpu_has_old_microcode())
+ setup_force_cpu_bug(X86_BUG_OLD_MICROCODE);
+
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;
diff -puN arch/x86/kernel/cpu/microcode/intel-ucode-defs.h~old-ucode-0 arch/x86/kernel/cpu/microcode/intel-ucode-defs.h
--- a/arch/x86/kernel/cpu/microcode/intel-ucode-defs.h~old-ucode-0 2024-11-06 07:51:07.368536035 -0800
+++ b/arch/x86/kernel/cpu/microcode/intel-ucode-defs.h 2024-11-07 09:01:21.269674736 -0800
@@ -0,0 +1,150 @@
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x03, .steppings = 0x0004, .driver_data = 0x2 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x05, .steppings = 0x0001, .driver_data = 0x45 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x05, .steppings = 0x0002, .driver_data = 0x40 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x05, .steppings = 0x0004, .driver_data = 0x2c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x05, .steppings = 0x0008, .driver_data = 0x10 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x06, .steppings = 0x0001, .driver_data = 0xa }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x06, .steppings = 0x0020, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x06, .steppings = 0x0400, .driver_data = 0xd }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x06, .steppings = 0x2000, .driver_data = 0x7 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x07, .steppings = 0x0002, .driver_data = 0x14 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x07, .steppings = 0x0004, .driver_data = 0x38 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x07, .steppings = 0x0008, .driver_data = 0x2e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x08, .steppings = 0x0002, .driver_data = 0x11 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x08, .steppings = 0x0008, .driver_data = 0x8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x08, .steppings = 0x0040, .driver_data = 0xc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x08, .steppings = 0x0400, .driver_data = 0x5 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x09, .steppings = 0x0020, .driver_data = 0x47 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0a, .steppings = 0x0001, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0a, .steppings = 0x0002, .driver_data = 0x1 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0b, .steppings = 0x0002, .driver_data = 0x1d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0b, .steppings = 0x0010, .driver_data = 0x2 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0d, .steppings = 0x0040, .driver_data = 0x18 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0e, .steppings = 0x0100, .driver_data = 0x39 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0e, .steppings = 0x1000, .driver_data = 0x59 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0f, .steppings = 0x0004, .driver_data = 0x5d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0f, .steppings = 0x0040, .driver_data = 0xd2 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0f, .steppings = 0x0080, .driver_data = 0x6b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0f, .steppings = 0x0400, .driver_data = 0x95 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0f, .steppings = 0x0800, .driver_data = 0xbc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x0f, .steppings = 0x2000, .driver_data = 0xa4 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x16, .steppings = 0x0002, .driver_data = 0x44 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x17, .steppings = 0x0040, .driver_data = 0x60f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x17, .steppings = 0x0080, .driver_data = 0x70a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x17, .steppings = 0x0400, .driver_data = 0xa0b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x1a, .steppings = 0x0010, .driver_data = 0x12 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x1a, .steppings = 0x0020, .driver_data = 0x1d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x1c, .steppings = 0x0004, .driver_data = 0x219 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x1c, .steppings = 0x0400, .driver_data = 0x107 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x1d, .steppings = 0x0002, .driver_data = 0x29 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x1e, .steppings = 0x0020, .driver_data = 0xa }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x25, .steppings = 0x0004, .driver_data = 0x11 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x25, .steppings = 0x0020, .driver_data = 0x7 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x26, .steppings = 0x0002, .driver_data = 0x105 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x2a, .steppings = 0x0080, .driver_data = 0x2f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x2c, .steppings = 0x0004, .driver_data = 0x1f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x2d, .steppings = 0x0040, .driver_data = 0x621 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x2d, .steppings = 0x0080, .driver_data = 0x71a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x2e, .steppings = 0x0040, .driver_data = 0xd }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x2f, .steppings = 0x0004, .driver_data = 0x3b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x37, .steppings = 0x0100, .driver_data = 0x838 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x37, .steppings = 0x0200, .driver_data = 0x90d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3a, .steppings = 0x0200, .driver_data = 0x21 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3c, .steppings = 0x0008, .driver_data = 0x28 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3d, .steppings = 0x0010, .driver_data = 0x2f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3e, .steppings = 0x0010, .driver_data = 0x42e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3e, .steppings = 0x0040, .driver_data = 0x600 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3e, .steppings = 0x0080, .driver_data = 0x715 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3f, .steppings = 0x0004, .driver_data = 0x49 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x3f, .steppings = 0x0010, .driver_data = 0x1a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x45, .steppings = 0x0002, .driver_data = 0x26 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x46, .steppings = 0x0002, .driver_data = 0x1c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x47, .steppings = 0x0002, .driver_data = 0x22 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x4c, .steppings = 0x0008, .driver_data = 0x368 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x4c, .steppings = 0x0010, .driver_data = 0x411 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x4d, .steppings = 0x0100, .driver_data = 0x12d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x4e, .steppings = 0x0008, .driver_data = 0xf0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x55, .steppings = 0x0008, .driver_data = 0x1000191 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x55, .steppings = 0x0010, .driver_data = 0x2007006 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x55, .steppings = 0x0020, .driver_data = 0x3000010 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x55, .steppings = 0x0040, .driver_data = 0x4003605 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x55, .steppings = 0x0080, .driver_data = 0x5003707 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x55, .steppings = 0x0800, .driver_data = 0x7002904 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x56, .steppings = 0x0004, .driver_data = 0x1c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x56, .steppings = 0x0008, .driver_data = 0x700001c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x56, .steppings = 0x0010, .driver_data = 0xf00001a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x56, .steppings = 0x0020, .driver_data = 0xe000015 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x5c, .steppings = 0x0004, .driver_data = 0x14 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x5c, .steppings = 0x0200, .driver_data = 0x48 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x5c, .steppings = 0x0400, .driver_data = 0x28 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x5e, .steppings = 0x0008, .driver_data = 0xf0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x5f, .steppings = 0x0002, .driver_data = 0x3e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x66, .steppings = 0x0008, .driver_data = 0x2a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x6a, .steppings = 0x0020, .driver_data = 0xc0002f0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x6a, .steppings = 0x0040, .driver_data = 0xd0003e7 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x6c, .steppings = 0x0002, .driver_data = 0x10002b0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x7a, .steppings = 0x0002, .driver_data = 0x42 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x7a, .steppings = 0x0100, .driver_data = 0x24 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x7e, .steppings = 0x0020, .driver_data = 0xc6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8a, .steppings = 0x0002, .driver_data = 0x33 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8c, .steppings = 0x0002, .driver_data = 0xb8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8c, .steppings = 0x0004, .driver_data = 0x38 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8d, .steppings = 0x0002, .driver_data = 0x52 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8e, .steppings = 0x0200, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8e, .steppings = 0x0400, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8e, .steppings = 0x0800, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8e, .steppings = 0x1000, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8f, .steppings = 0x0100, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8f, .steppings = 0x0080, .driver_data = 0x2b0005c0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8f, .steppings = 0x0040, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8f, .steppings = 0x0020, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x8f, .steppings = 0x0010, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x96, .steppings = 0x0002, .driver_data = 0x1a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x97, .steppings = 0x0004, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x97, .steppings = 0x0020, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xbf, .steppings = 0x0004, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xbf, .steppings = 0x0020, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9a, .steppings = 0x0008, .driver_data = 0x434 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9a, .steppings = 0x0010, .driver_data = 0x434 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9c, .steppings = 0x0001, .driver_data = 0x24000026 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9e, .steppings = 0x0200, .driver_data = 0xf8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9e, .steppings = 0x0400, .driver_data = 0xf8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9e, .steppings = 0x0800, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9e, .steppings = 0x1000, .driver_data = 0xf8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0x9e, .steppings = 0x2000, .driver_data = 0x100 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xa5, .steppings = 0x0004, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xa5, .steppings = 0x0008, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xa5, .steppings = 0x0020, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xa6, .steppings = 0x0001, .driver_data = 0xfe }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xa6, .steppings = 0x0002, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xa7, .steppings = 0x0002, .driver_data = 0x62 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xaa, .steppings = 0x0010, .driver_data = 0x1f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xb7, .steppings = 0x0002, .driver_data = 0x12b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xba, .steppings = 0x0004, .driver_data = 0x4122 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xba, .steppings = 0x0008, .driver_data = 0x4122 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xba, .steppings = 0x0100, .driver_data = 0x4122 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xbe, .steppings = 0x0001, .driver_data = 0x1a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xcf, .steppings = 0x0004, .driver_data = 0x21000230 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xcf, .steppings = 0x0002, .driver_data = 0x21000230 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x00, .steppings = 0x0080, .driver_data = 0x12 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x00, .steppings = 0x0400, .driver_data = 0x15 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x01, .steppings = 0x0004, .driver_data = 0x2e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x02, .steppings = 0x0010, .driver_data = 0x21 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x02, .steppings = 0x0020, .driver_data = 0x2c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x02, .steppings = 0x0040, .driver_data = 0x10 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x02, .steppings = 0x0080, .driver_data = 0x39 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x02, .steppings = 0x0200, .driver_data = 0x2f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x03, .steppings = 0x0004, .driver_data = 0xa }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x03, .steppings = 0x0008, .driver_data = 0xc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x03, .steppings = 0x0010, .driver_data = 0x17 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x04, .steppings = 0x0002, .driver_data = 0x17 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x04, .steppings = 0x0008, .driver_data = 0x5 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x04, .steppings = 0x0010, .driver_data = 0x6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x04, .steppings = 0x0080, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x04, .steppings = 0x0100, .driver_data = 0xe }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x04, .steppings = 0x0200, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x04, .steppings = 0x0400, .driver_data = 0x4 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x06, .steppings = 0x0004, .driver_data = 0xf }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x06, .steppings = 0x0010, .driver_data = 0x4 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x06, .steppings = 0x0020, .driver_data = 0x8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf, .model = 0x06, .steppings = 0x0100, .driver_data = 0x9 }
diff -puN arch/x86/kernel/cpu/bugs.c~old-ucode-0 arch/x86/kernel/cpu/bugs.c
--- a/arch/x86/kernel/cpu/bugs.c~old-ucode-0 2024-11-06 07:58:23.256734986 -0800
+++ b/arch/x86/kernel/cpu/bugs.c 2024-11-06 08:31:44.045967210 -0800
@@ -2945,6 +2945,9 @@ static ssize_t cpu_show_common(struct de
case X86_BUG_RFDS:
return rfds_show_state(buf);
+ case X86_BUG_OLD_MICROCODE:
+ return sysfs_emit(buf, "Vulnerable\n");
+
default:
break;
}
@@ -3024,6 +3027,11 @@ ssize_t cpu_show_reg_file_data_sampling(
{
return cpu_show_common(dev, attr, buf, X86_BUG_RFDS);
}
+
+ssize_t cpu_show_old_microcode(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_OLD_MICROCODE);
+}
#endif
void __warn_thunk(void)
diff -puN drivers/base/cpu.c~old-ucode-0 drivers/base/cpu.c
--- a/drivers/base/cpu.c~old-ucode-0 2024-11-06 08:26:51.505813735 -0800
+++ b/drivers/base/cpu.c 2024-11-06 08:29:56.925911521 -0800
@@ -599,6 +599,7 @@ CPU_SHOW_VULN_FALLBACK(retbleed);
CPU_SHOW_VULN_FALLBACK(spec_rstack_overflow);
CPU_SHOW_VULN_FALLBACK(gds);
CPU_SHOW_VULN_FALLBACK(reg_file_data_sampling);
+CPU_SHOW_VULN_FALLBACK(old_microcode);
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
@@ -614,6 +615,7 @@ static DEVICE_ATTR(retbleed, 0444, cpu_s
static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL);
static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL);
static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL);
+static DEVICE_ATTR(old_microcode, 0444, cpu_show_old_microcode, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@@ -630,6 +632,7 @@ static struct attribute *cpu_root_vulner
&dev_attr_spec_rstack_overflow.attr,
&dev_attr_gather_data_sampling.attr,
&dev_attr_reg_file_data_sampling.attr,
+ &dev_attr_old_microcode.attr,
NULL
};
diff -puN include/linux/cpu.h~old-ucode-0 include/linux/cpu.h
--- a/include/linux/cpu.h~old-ucode-0 2024-11-06 08:32:44.333998355 -0800
+++ b/include/linux/cpu.h 2024-11-06 08:33:02.998007967 -0800
@@ -77,6 +77,8 @@ extern ssize_t cpu_show_gds(struct devic
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_old_microcode(struct device *dev,
+ struct device_attribute *attr, char *buf);
extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
_
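
For readers who want to experiment with the decision logic outside the kernel, here is a minimal userspace sketch of what cpu_has_old_microcode() does. It is not the kernel code: struct ucode_entry, find_entry() and has_old_microcode() are hypothetical stand-ins for struct x86_cpu_id, x86_match_cpu() and the function in the patch, and the sketch assumes that .steppings is a bitmask in which bit N selects stepping N. The three table rows are copied from intel-ucode-defs.h above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's struct x86_cpu_id. */
struct ucode_entry {
	uint8_t  family;
	uint8_t  model;
	uint16_t steppings;	/* assumed bitmask: bit N set means stepping N matches */
	uint32_t latest_rev;	/* plays the role of .driver_data in the patch */
};

/* Three rows copied from intel-ucode-defs.h in the patch. */
const struct ucode_entry sample_table[] = {
	{ 0x6, 0xb7, 0x0002, 0x12b },
	{ 0x6, 0x9e, 0x2000, 0x100 },
	{ 0x6, 0x55, 0x0080, 0x5003707 },
};

/* Return the first entry matching family/model/stepping, or NULL if none. */
const struct ucode_entry *find_entry(const struct ucode_entry *table, size_t n,
				     uint8_t family, uint8_t model,
				     uint8_t stepping)
{
	assert(table != NULL);

	if (stepping > 15)
		return NULL;

	for (size_t i = 0; i < n; i++) {
		const struct ucode_entry *e = &table[i];

		if (e->family == family && e->model == model &&
		    (e->steppings & (1u << stepping)))
			return e;
	}
	return NULL;
}

/* Mirrors the order of checks in the patch's cpu_has_old_microcode(). */
bool has_old_microcode(const struct ucode_entry *e, uint32_t running_rev)
{
	/* Give unknown CPUs a pass. */
	if (!e)
		return false;

	/* Consider all debug microcode (bit 31 set) to be old. */
	if (running_rev & (UINT32_C(1) << 31))
		return true;

	/* Old only if strictly below the listed revision. */
	return running_rev < e->latest_rev;
}
```

With this table, a family 0x6, model 0xb7, stepping 1 part running revision 0x129 is reported as old, while the same part running 0x12b (or anything newer) is not. A CPU with no matching row is never flagged.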
On Thu, Nov 07, 2024 at 09:06:30AM -0800, Dave Hansen wrote:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> You can't practically run old microcode and consider a system secure
> these days. So, let's call old microcode what it is: a vulnerability.
> Expose that vulnerability in a place that folks can find it:
>
> /sys/devices/system/cpu/vulnerabilities/old_microcode

Sorry for playing the devil's advocate. I am wondering who is the prime
beneficiary of this change? Roughly dividing the user base into:

1. People who get their updates from a distro. As distros also provide
   the microcode, most likely, their kernel will be patched to agree that
   the microcode that they provide has the latest security fixes.
   Effectively distros have the control over what the kernel reports.

2. People who get their updates from a distro, but build their own kernel
   could benefit from this change. Broadly these would be CSPs/embedded
   vendors/developers etc.

   - I am assuming CSPs are well versed with the microcode updates and
     hand-pick the microcode that they want to apply. So, they may not
     care too much about microcode being old. And the majority of their
     users that run workloads in a guest VM won't see the microcode
     version.

   - In my experience, embedded vendors generally take a very long time
     to provide updates. They could benefit from this change when they
     eventually update their kernel.

   - Expert users/developers who submit bug reports to mailing lists can
     now know that they are running old microcode, and should update
     their microcode before submitting a bug report. To me they would
     benefit the most from this change.

For this to help category 1 users, we need blessing from distro
providers. It would be great if more and more distros provide their
agreement/feedback on this change, as they are the ones who would enforce
this change.
On 11/19/24 09:45, Pawan Gupta wrote:
> Sorry for playing the devil's advocate. I am wondering who is the prime
> beneficiary of this change?

At a very high level, it's for folks with new kernels and old microcode.

It's _very_ normal for someone to report a bug and for us upstream folks
to ask them to reproduce on the latest mainline. The moment they do
that, they get the latest microcode list. Folks don't randomly upgrade
to a new kernel for fun in production. But it's hopefully a very normal
activity for folks having problems and launching into debug.

In other words, "new kernel / old microcode" might be relatively rare,
but it still gets used at a *very* critical choke point.

I completely agree with your general sentiment that normal distro users
will get the distro-kernel-provided microcode version list _and_
distro-provided microcode files. This won't help them one bit unless the
distro makes a silly mistake, doesn't do testing, or they somehow upgrade
one package without the other.
On Tue, Nov 19, 2024 at 10:49:21AM -0800, Dave Hansen wrote:
> On 11/19/24 09:45, Pawan Gupta wrote:
> > Sorry for playing the devil's advocate. I am wondering who is the prime
> > beneficiary of this change?
>
> At a very high level, it's for folks with new kernels and old microcode.
>
> It's _very_ normal for someone to report a bug and for us upstream folks
> to ask them to reproduce on the latest mainline. The moment they do
> that, they get the latest microcode list. Folks don't randomly upgrade
> to a new kernel for fun in production. But it's hopefully a very normal
> activity for folks having problems and launching into debug.

Ah, that makes sense.

> In other words, "new kernel / old microcode" might be relatively rare,
> but it still gets used at a *very* critical choke point.

Right.
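
As an illustration of the bug-report workflow discussed above, a reporter's tooling could read the proposed sysfs file before a report is sent. The sketch below is an assumption-laden example, not part of the patch: old_microcode_status() is a hypothetical helper, and it relies on the file reading "Vulnerable" when the bug bit is set (as the bugs.c hunk emits) and on any other content, such as the generic "Not affected" string, meaning the check passed. The path is a parameter so the helper can be pointed at a test file on kernels that lack the attribute.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the first line of @path starts with
 * "Vulnerable", 0 if it starts with anything else, and -1 if the file
 * cannot be opened or read (for example on a kernel without this patch).
 */
int old_microcode_status(const char *path)
{
	char line[128];
	FILE *f;

	assert(path != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return strncmp(line, "Vulnerable", strlen("Vulnerable")) == 0 ? 1 : 0;
}
```

A caller would pass "/sys/devices/system/cpu/vulnerabilities/old_microcode" and treat a return value of 1 as a hint to update microcode before reporting, and -1 as "this kernel does not expose the file".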
>
> == Microcode Revision Discussion ==
>
> The microcode versions in the table were generated from the Intel
> microcode git repo:
>
> 29f82f7429c ("microcode-20241029 Release")

This upstream microcode release only contained an update for a
functional issue[1] - not any fixes for security issues. So it would not
really be correct to say a machine running the previous microcode
revision is vulnerable.

As such, should the table of microcode revisions only be generated from
the upstream microcode releases that contain fixes for security issues?

i.e.

> +{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xb7, .steppings = 0x0002, .driver_data = 0x12b }

should ideally be:

> +{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6, .model = 0xb7, .steppings = 0x0002, .driver_data = 0x129 }

to correspond with the previous microcode release that contained actual
security fixes.

[1] https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20241029
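To make the disagreement over 0x12b versus 0x129 concrete, here is a minimal userspace sketch of the comparison such a table entry implies. It is not the patch's code: the struct, the `min_rev` field name (standing in for `.driver_data`) and `ucode_is_old()` are hypothetical, and it assumes `.steppings` is a bitmask in which bit N selects stepping N.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of one row of the revision table. */
struct ucode_entry {
    uint8_t  family;
    uint8_t  model;
    uint16_t steppings; /* assumed bitmask: bit N set means stepping N */
    uint32_t min_rev;   /* stands in for the value kept in .driver_data */
};

/* Only the row quoted in the thread; a real table has many rows. */
static const struct ucode_entry ucode_table[] = {
    { .family = 0x6, .model = 0xb7, .steppings = 0x0002, .min_rev = 0x12b },
};

/* True only when a row matches and the running revision is below it. */
bool ucode_is_old(uint8_t family, uint8_t model, unsigned int stepping,
                  uint32_t running_rev)
{
    size_t i;

    if (stepping > 15)
        return false; /* outside the 16-bit stepping mask */

    for (i = 0; i < sizeof(ucode_table) / sizeof(ucode_table[0]); i++) {
        const struct ucode_entry *e = &ucode_table[i];

        if (e->family != family || e->model != model)
            continue;
        if (!(e->steppings & (1u << stepping)))
            continue;
        return running_rev < e->min_rev;
    }
    return false; /* CPU not listed: this sketch makes no claim */
}
```

With the 0x12b row, a machine on revision 0x129 is reported as old; changing `min_rev` to 0x129, as suggested above, would report the same machine as current.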
>> == Microcode Revision Discussion ==
>>
>> The microcode versions in the table were generated from the Intel
>> microcode git repo:
>>
>> 29f82f7429c ("microcode-20241029 Release")
> This upstream microcode release only contained an update for a
> functional issue[1] - not any fixes for security issues.

Now I can point at them, see the release notes for 8ac9378a8487
("microcode-20241112 Release").

Note how it's admitting to have fixed security issues silently in prior
drops. If I were you, I wouldn't make assumptions based on what's not
said in the release notes.

I can count on one hand the number of drops in that repo which I know
(or reasonably suspect) to be "functional issues only", but the one you
happened to reference is a fix for "this CPU overvolts itself to an
early grave". Many would consider this a denial of service, against
one's wallet if nothing else.

~Andrew
On 11/13/24 18:09, Andrew Cooper wrote:
> Note how it's admitting to have fixed security issues silently in prior
> drops. If I were you, I wouldn't make assumptions based on what's not
> said in the release notes.

I've gotten two pieces of feedback. Paraphrasing, one bit of feedback
from Andrew says:

    Don't explicitly trust the release notes. Here's an active,
    super recent example of when you can't trust them.

and Alex (elsewhere in the thread) says:

    The kernel should explicitly trust the release notes (for
    security vulnerability statements at least)

I'm partial to Andrew's position here because of the real-world recent
evidence.

It also occurs to me that Intel could have done this for two reasons:
First, it could be an attempt to do coordinated disclosure when the
microcode to fix the issue is ready well in advance of an issue being
disclosed. The second is that a human made an error and neglected to
mention the security issues in the release notes.

We can fix the first issue by asking my Intel colleagues to not do that
in the future. I'd be happy to do that if folks want. But we can't fix
the second issue until we have either infallible humans (or AI). I'm not
sure either one is on the horizon. ;)
On 11/11/24 22:37, Alex Murray wrote:
>> == Microcode Revision Discussion ==
>>
>> The microcode versions in the table were generated from the Intel
>> microcode git repo:
>>
>> 29f82f7429c ("microcode-20241029 Release")
>
> This upstream microcode release only contained an update for a
> functional issue[1] - not any fixes for security issues. So it would not
> really be correct to say a machine running the previous microcode
> revision is vulnerable.

There are literally two things this patch "says". One is in userspace
and can be literally read as:

    /sys/devices/system/cpu/vulnerabilities/old_microcode

"You are vulnerable to old CPU microcode".

The other is in the code: X86_BUG_OLD_MICROCODE. Which can literally be
read to say "you have a CPU bug called 'old microcode'". (Oh, and I
guess this comes out in /proc/cpuinfo too.)

If you think this is confusing, we can document our way out of it or
revise the changelog. But we kinda get to define what the file and the
X86_BUG mean in the first place.

I don't really see how it's possible to argue that they're "incorrect".

> As such, should the table of microcode revisions only be generated
> from the upstream microcode releases that contain fixes for security
> issues?

No, I don't think so. First, I honestly don't want to have this
discussion every three months where folks can argue about whether a
given microcode release is functional or security. Or, even worse,
which individual microcode *image* is which.

Second, running kernels with functional issues is *BAD*. As a kernel
policy, we don't want users running with old microcode. Security bugs
only hurt our users but functional bugs hurt the kernel too because
users blame the kernel when they hit them and kernel developers spend
time chasing those issues down.

So I guess it boils down to: First, should we tell users when their
microcode is old? If so, how should we do it?
On 12.11.24 at 17:51, Dave Hansen wrote:
> On 11/11/24 22:37, Alex Murray wrote:
>>> == Microcode Revision Discussion ==
>>>
>>> The microcode versions in the table were generated from the Intel
>>> microcode git repo:
>>>
>>> 29f82f7429c ("microcode-20241029 Release")
>>
>> This upstream microcode release only contained an update for a
>> functional issue[1] - not any fixes for security issues. So it would not
>> really be correct to say a machine running the previous microcode
>> revision is vulnerable.
>
> There are literally two things this patch "says". One is in userspace
> and can be literally read as:
>
> /sys/devices/system/cpu/vulnerabilities/old_microcode
>
> "You are vulnerable to old CPU microcode".
>
> The other is in the code: X86_BUG_OLD_MICROCODE. Which can literally be
> read to say "you have a CPU bug called 'old microcode'. (Oh, and I guess
> this comes out in /proc/cpuinfo too).
>
> If you think this is confusing, we can document our way out of it or
> revise the changelog. But we kinda get to define what the file and the
> X86_BUG mean in the first place.
>
> I don't really see how it's possible to argue that they're "incorrect".
>
>> As such, should the table of microcode revisions only be generated
>> from the upstream microcode releases that contain fixes for security
>> issues?
>
> No, I don't think so. First, I honestly don't want to have this
> discussion every three months where folks can argue about whether a
> given microcode release is functional or security. Or, even worse,
> which individual microcode *image* is which.
>
> Second, running kernels with functional issues is *BAD*. As a kernel
> policy, we don't want users running with old microcode. Security bugs
> only hurt our users but functional bugs hurt the kernel too because
> users blame the kernel when they hit them and kernel developers spend
> time chasing those issues down.
<Perhaps offtopic> Probably the same reasoning can be applied here as for the CVEs - since the kernel (microcode) is a very fundamental piece of software, almost any issue can be treated as a security one (at least judging from the influx of automatically generated CVEs). By the same token we can assume that microcode always fixes a critical issue :) </Perhaps offtopic> > > So I guess it boils down to: First, should we tell users when their > microcode is old? If so, how should we do it? >
On Tue, 2024-11-12 at 07:51:38 -0800, Dave Hansen wrote: > On 11/11/24 22:37, Alex Murray wrote: >>> == Microcode Revision Discussion == >>> >>> The microcode versions in the table were generated from the Intel >>> microcode git repo: >>> >>> 29f82f7429c ("microcode-20241029 Release") >> >> This upstream microcode release only contained an update for a >> functional issue[1] - not any fixes for security issues. So it would not >> really be correct to say a machine running the previous microcode >> revision is vulnerable. > > There are literally two things this patch "says". One is in userspace > and can be literally read as: > > /sys/devices/system/cpu/vulnerabilities/old_microcode > > "You are vulnerable to old CPU microcode". > > The other is in the code: X86_BUG_OLD_MICROCODE. Which can literally be > read to say "you have a CPU bug called 'old microcode'. (Oh, and I guess > this comes out in /proc/cpuinfo too). > > If you think this is confusing, we can document our way out of it or > revise the changelog. But we kinda get to define what the file and the > X86_BUG mean in the first place. > > I don't really see how it's possible to argue that they're > "incorrect". My point is that if a given microcode contains only functional updates, then if you are *not* using it you do not have a security vulnerability. If however the specified microcode revision fixes a known security issue then yes I agree, there is a vulnerability and if you are not using this microcode revision you are vulnerable to it. It is really the distinction between a microcode update that is purely for functional issues compared to one that is for security issues as well. > >> As such, should the table of microcode revisions only be generated >> from the upstream microcode releases that contain fixes for security >> issues? > > No, I don't think so. 
First, I honestly don't want to have this > discussion every three months where folks can argue about whether a > given microcode release is functional or security. Or, even worse, > which individual microcode *image* is which. I don't think there is an argument here - releases at https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files clearly say if they contain Security updates or updates for functional issues - so if a release like the previous 20241029 one only contains an update for functional issues it should not be treated as a security issue if a system is not running it. > > Second, running kernels with functional issues is *BAD*. As a kernel > policy, we don't want users running with old microcode. Security bugs > only hurt our users but functional bugs hurt the kernel too because > users blame the kernel when they hit them and kernel developers spend > time chasing those issues down. > But just because something is bad that doesn't mean it is a security vulnerability. One option could be to taint the kernel in this case instead. > So I guess it boils down to: First, should we tell users when their > microcode is old? If so, how should we do it? So I suggest instead if you really want to flag old microcode as an issue you could taint it as such since the description of tainted is > The kernel will mark itself as ‘tainted’ when something occurs that > might be relevant later when investigating problems which feels like exactly the kind of semantics you describe above. Then if you also want to surface old microcode that also is missing security fixes you could then use your new proposed mechanism.
On 11/12/24 19:29, Alex Murray wrote: >> No, I don't think so. First, I honestly don't want to have this >> discussion every three months where folks can argue about whether a >> given microcode release is functional or security. Or, even worse, >> which individual microcode *image* is which. > I don't think there is an argument here - releases at > https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files > clearly say if they contain Security updates or updates for functional > issues - so if a release like the previous 20241029 one only contains an > update for functional issues it should not be treated as a security > issue if a system is not running it. While I applaud your trust in my employer, I don't see quite as bright of a line between security and functional problems. Here's the bottom line: I agree that setting a taint flag for old microcode seems like a good idea. But I also think that there's enough of a "vulnerability" (security or otherwise) to justify placing "old_microcode" alongside the CPU security vulnerabilities that have known exploits. I'm lazy and don't want to read and filter the microcode changelogs. I also don't want to have to trust my colleagues to precisely agree on where that line is between a security and functional problem. So I'm leaning toward setting: TAINT_CPU_OUT_OF_SPEC plus X86_BUG_OLD_MICROCODE and calling it a day.
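For anyone wanting to check the proposed taint from userspace, a minimal sketch follows. It assumes TAINT_CPU_OUT_OF_SPEC keeps its documented position in the taint mask (bit 2, value 4, letter 'S') and that the mask is read as decimal text from /proc/sys/kernel/tainted. Both function names are hypothetical, and nothing here says the taint was set *because of* old microcode: other out-of-spec conditions set the same bit.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Bit 2 of the taint mask is the "out of specification" flag ('S'). */
#define TAINT_CPU_OUT_OF_SPEC_BIT 2

/* Test the out-of-spec bit in an already-parsed taint mask. */
bool taint_has_cpu_out_of_spec(unsigned long taint_mask)
{
    return (taint_mask >> TAINT_CPU_OUT_OF_SPEC_BIT) & 1ul;
}

/* Same check on the decimal text read from /proc/sys/kernel/tainted. */
bool taint_text_has_cpu_out_of_spec(const char *text)
{
    return taint_has_cpu_out_of_spec(strtoul(text, NULL, 10));
}
```

A caller would read the file into a buffer and pass it to `taint_text_has_cpu_out_of_spec()`; the X86_BUG side would still have to be checked separately, for example via the sysfs file discussed above.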
On Wed, 2024-11-13 at 08:00:26 -0800, Dave Hansen wrote:
> While I applaud your trust in my employer, I don't see quite as bright
> of a line between security and functional problems.
>
> Here's the bottom line: I agree that setting a taint flag for old
> microcode seems like a good idea. But I also think that there's enough
> of a "vulnerability" (security or otherwise) to justify placing
> "old_microcode" alongside the CPU security vulnerabilities that have
> known exploits.
>
> I'm lazy and don't want to read and filter the microcode changelogs. I
> also don't want to have to trust my colleagues to precisely agree on
> where that line is between a security and functional problem.
>

The only other data point then to mention is that all the major distros
(Debian[1], Ubuntu[2] and Fedora[3]) are still only shipping the previous
security update release (20240910) in their stable releases - *not* the
more recent release with the functional updates in 20241029 - in which
case anyone running a current stable release would then show as being
"vulnerable". I can't speak for the other distros, but for Ubuntu we
generally only ship things which are called out as specific security
fixes in our security updates *and* we generally prioritise security
updates over bug fixes (which these 'functional' updates appear to be,
rather than fixes for actual exploitable security issues).

> So I'm leaning toward setting:
>
> TAINT_CPU_OUT_OF_SPEC
> plus
> X86_BUG_OLD_MICROCODE
>
> and calling it a day.

Does this mean you are thinking of dropping the userspace entry in the
cpu vulnerabilities sysfs tree?

If so then I am not so concerned, since my primary concern is having
something which looks scary to users/sysadmins ("your CPU has an
unpatched vulnerability") which they can't do anything about since
their distribution has a different definition of what counts as a
security update compared to the upstream kernel maintainers.
If the sysfs entry is dropped then this is not so visible to end-users and hence there is less panic. [1] https://packages.debian.org/search?keywords=intel-microcode [2] https://launchpad.net/ubuntu/+source/intel-microcode [3] https://packages.fedoraproject.org/pkgs/microcode_ctl/microcode_ctl/fedora-41.html
On 11/13/24 15:58, Alex Murray wrote: ... > The only other data point then to mention is that all the major distros > (Debian[1], Ubuntu[2] and Fedora[3]) are still only shipping the previous > security update release (20240910) in their stable releases - *not* the > more recent release with the functional updates in 20241029 - in which > case anyone running a current stable release would then show as being > "vulnerable". I can't speak for the other distros, but for Ubuntu we > generally only ship things which are called out as specific security > fixes in our security updates *and* we generally prioritise security > updates over bug fixes (which these 'functional' updates appear be > rather than fixing actual exploitable security issues). That's a very important data point. Thanks for that. Like I said in the original changelog, I'm open to relaxing things to define old to allow folks to be a release or two behind. But I'd want to hear a lot more about _why_ the distros lag. I'd probably also have some chats to see what other folks at Intel think about it. So what would you propose the rules be? Are you suggesting that we go through the microcode changelogs for each CPU for each release and only update the "old" revisions for security issues? If there were only functional issues fixed for, say, 2 years, on a CPU would the "old" version get updated? >> So I'm leaning toward setting: >> >> TAINT_CPU_OUT_OF_SPEC >> plus >> X86_BUG_OLD_MICROCODE >> >> and calling it a day. > > Does this mean you are thinking of dropping the userspace entry in the > cpu vulnerablities sysfs tree? No, I plan to keep X86_BUG_OLD_MICROCODE and the corresponding sysfs entry. 
> If so then I am not so concerned, since my primary concern is having > something which looks scary to users/sysadmins ("your CPU has an > unpatched vulnerablity") which they can't do anything about since > their distribution has a different definition of what counts as a > security update compared to the upstream kernel maintainers. If the > sysfs entry is dropped then this is not so visible to end-users and > hence there is less panic. Right, we don't want to unnecessarily scare anyone. But if a distro is being too slow in getting microcode out, then it would be good to inform users about known functional or security gaps they're exposed to. That's the thing we need to focus on. Not: "Can users do anything about it?" Rather: "What's best for the users?"
On Wed, 2024-11-13 at 16:37:31 -0800, Dave Hansen wrote: > On 11/13/24 15:58, Alex Murray wrote: > ... >> The only other data point then to mention is that all the major distros >> (Debian[1], Ubuntu[2] and Fedora[3]) are still only shipping the previous >> security update release (20240910) in their stable releases - *not* the >> more recent release with the functional updates in 20241029 - in which >> case anyone running a current stable release would then show as being >> "vulnerable". I can't speak for the other distros, but for Ubuntu we >> generally only ship things which are called out as specific security >> fixes in our security updates *and* we generally prioritise security >> updates over bug fixes (which these 'functional' updates appear be >> rather than fixing actual exploitable security issues). > > That's a very important data point. Thanks for that. > > Like I said in the original changelog, I'm open to relaxing things to > define old to allow folks to be a release or two behind. But I'd want to > hear a lot more about _why_ the distros lag. I'd probably also have some > chats to see what other folks at Intel think about it. Again, I can't speak for other distros but for Ubuntu see my comment above re prioritising security vs functional updates. > > So what would you propose the rules be? Are you suggesting that we go > through the microcode changelogs for each CPU for each release and only > update the "old" revisions for security issues? If there were only > functional issues fixed for, say, 2 years, on a CPU would the "old" > version get updated? For calling out old microcode as a vulnerability, yes I would prefer that only releases which your colleagues state as fixing security issues get included. However, for the tainted case, anything older than the current release would make sense. 
In which case you would have to maintain two different revision IDs per
MCU - one which is the latest, and the other which is the latest with a
security fix for a given platform.

From my experience though it is rare for a new upstream microcode
release not to contain some security fixes. So perhaps this distinction
will be mostly irrelevant in practice, assuming almost all MCU releases
contain a fix for some security issue.

>
>>> So I'm leaning toward setting:
>>>
>>> TAINT_CPU_OUT_OF_SPEC
>>> plus
>>> X86_BUG_OLD_MICROCODE
>>>
>>> and calling it a day.
>>
>> Does this mean you are thinking of dropping the userspace entry in the
>> cpu vulnerablities sysfs tree?
>
> No, I plan to keep X86_BUG_OLD_MICROCODE and the corresponding sysfs entry.
>
>> If so then I am not so concerned, since my primary concern is having
>> something which looks scary to users/sysadmins ("your CPU has an
>> unpatched vulnerablity") which they can't do anything about since
>> their distribution has a different definition of what counts as a
>> security update compared to the upstream kernel maintainers. If the
>> sysfs entry is dropped then this is not so visible to end-users and
>> hence there is less panic.
>
> Right, we don't want to unnecessarily scare anyone.
>
> But if a distro is being too slow in getting microcode out, then it
> would be good to inform users about known functional or security gaps
> they're exposed to.
>
> That's the thing we need to focus on. Not: "Can users do anything about
> it?" Rather: "What's best for the users?"

Yep I agree, end-users should always be the primary concern, especially
for new user-visible things like new entries in the vulnerabilities
sysfs tree.
Also I am not averse to calling out the situation of running an
out-of-date microcode *which has known security issues* as a
vulnerability. I think providing more data to users to help them make
the best assessment of any given risk is always a good thing, but we
just need to be mindful to do it in a way that is hopefully actionable
as well.
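The "two revision IDs per MCU" idea can be sketched as a three-way classification. This is only an illustration of the proposal in this message, not anything in the posted patch: the struct, the enum and `classify_ucode()` are hypothetical names, and the 0x129/0x12b pair in the usage note is borrowed from the example earlier in the thread.

```c
#include <stdint.h>

/* Hypothetical outcome of comparing a running revision to both IDs. */
enum ucode_status {
    UCODE_CURRENT,              /* at or above the latest release */
    UCODE_BEHIND,               /* older, but has the last security fix */
    UCODE_MISSING_SECURITY_FIX, /* older than the last security release */
};

/* Hypothetical per-CPU pair of thresholds, as proposed above. */
struct ucode_thresholds {
    uint32_t latest_rev;          /* newest published revision */
    uint32_t latest_security_rev; /* newest revision carrying a security fix */
};

enum ucode_status classify_ucode(const struct ucode_thresholds *t,
                                 uint32_t running_rev)
{
    if (running_rev < t->latest_security_rev)
        return UCODE_MISSING_SECURITY_FIX; /* would be "vulnerable" */
    if (running_rev < t->latest_rev)
        return UCODE_BEHIND;               /* would only taint */
    return UCODE_CURRENT;
}
```

With `{ .latest_rev = 0x12b, .latest_security_rev = 0x129 }`, a machine on 0x129 would only be tainted, while one on 0x128 would also be reported as vulnerable. The cost is the one named above: somebody has to decide, per release and per CPU, which of the two numbers moves.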
> You can't practically run old microcode and consider a system secure > these days. So, let's call old microcode what it is: a vulnerability. The list becomes stale 4 times a year, so you need to identify when it's out of date, and whatever that something is has to be strong enough to cause distros to backport too. Perhaps a date in the header, so you can at least report "status vulnerable, metadata out of date". Also, you want to identify EOL CPUs. Just because they're on the most recent published ucode doesn't mean they're not vulnerable. Under some hypervisors, you get fed the revision 0x7fffffff. Others might tell you the truth, or it may be the truth from when you booted. For this, probably best to say "consult your hypervisor". Failure to publish information, or not publishing fixes for in-support parts should be considered a vulnerability. (*ahem*, AMD) Or you could just simplify the whole path to "yes". It's true, even if people don't know. I really want to like this, but it's a giant can of worms, with as many political challenges as technical. ~Andrew P.S. I do like that you've labelled debug microcode as vulnerable. It's just software in a different form factor, and we know how buggy software generally is.
On 11/8/24 15:36, Andrew Cooper wrote:
>> You can't practically run old microcode and consider a system secure
>> these days. So, let's call old microcode what it is: a vulnerability.
>
> The list becomes stale 4 times a year, so you need to identify when it's
> out of date, and whatever that something is has to be strong enough to
> cause distros to backport too. Perhaps a date in the header, so you can
> at least report "status vulnerable, metadata out of date".

I don't want to get too fancy about this. I'm assuming that mainline
and the stable kernels will be regularly fed new metadata. The only way
to have out-of-date metadata should be by running an out-of-date kernel
in which case you have bigger problems on your hands.

> Also, you want to identify EOL CPUs. Just because they're on the most
> recent published ucode doesn't mean they're not vulnerable.

That's a good idea too. But I think it deserves a separate discussion
and separate patch.

> Under some hypervisors, you get fed the revision 0x7fffffff. Others
> might tell you the truth, or it may be the truth from when you booted.
> For this, probably best to say "consult your hypervisor".

Good point. We should probably just say "unknown" when running as a
guest, or just not have the sysfs file at all.

> Failure to publish information, or not publishing fixes for in-support
> parts should be considered a vulnerability. (*ahem*, AMD)
>
> Or you could just simplify the whole path to "yes". It's true, even if
> people don't know.

This series answers the question:

    Has the vendor published a newer OS-loadable microcode
    than you are running right now?

It doesn't seek to answer the question:

    Is the microcode that you are running right now vulnerable
    to anything (that the kernel knows about)?

I think the first question is quite answerable in a pretty factual way.
The second question is much harder. It's worth answering for sure ...
with another patch. :)
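A rough sketch of the "say unknown when running as a guest" idea: the reported status depends first on whether the revision can be trusted at all, and only then on the comparison. The function name and the three strings are hypothetical, chosen to resemble what other files in the vulnerabilities directory print; the thread does not settle on exact wording.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical status text. A guest may be fed a made-up revision such
 * as 0x7fffffff, so the comparison is skipped entirely in that case.
 */
const char *old_microcode_status(bool is_guest, uint32_t running_rev,
                                 uint32_t min_rev)
{
    if (is_guest)
        return "Unknown";      /* revision cannot be trusted */
    if (running_rev < min_rev)
        return "Vulnerable";   /* a newer OS-loadable revision exists */
    return "Not affected";
}
```

Checking the guest case first matters: 0x7fffffff compares as newer than any real threshold, so without that check a guest would always read as "Not affected", which is exactly the misleading answer raised above.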