[RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability

Dave Hansen posted 1 patch 2 weeks, 2 days ago
b/arch/x86/include/asm/cpufeatures.h               |    1
b/arch/x86/kernel/cpu/bugs.c                       |    8 +
b/arch/x86/kernel/cpu/common.c                     |   28 +++
b/arch/x86/kernel/cpu/microcode/intel-ucode-defs.h |  150 +++++++++++++++++++++
b/drivers/base/cpu.c                               |    3
b/include/linux/cpu.h                              |    2
6 files changed, 192 insertions(+)
Posted by Dave Hansen 2 weeks, 2 days ago

From: Dave Hansen <dave.hansen@linux.intel.com>

You can't practically run old microcode and consider a system secure
these days.  So, let's call old microcode what it is: a vulnerability.
Expose that vulnerability in a place that folks can find it:

	/sys/devices/system/cpu/vulnerabilities/old_microcode

This is obviously imperfect.  But it means that a single file can be
maintained with a single list of microcode versions and there is no need
to track which version fixed a given bug.
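
For anyone scripting against this, here is a minimal user-space sketch of
checking the new file. It assumes the patch is applied. The helper names
read_vuln_line() and line_is_vulnerable() are made up for illustration and
are not part of the patch. A missing file just means the running kernel
does not carry this change.

```c
#include <stdio.h>
#include <string.h>

/* Path proposed by this patch; absent on kernels without it. */
#define OLD_UCODE_PATH "/sys/devices/system/cpu/vulnerabilities/old_microcode"

/*
 * Read the first line of a sysfs-style file into buf (newline stripped).
 * Returns 0 on success, -1 if the file is missing or unreadable.
 */
static int read_vuln_line(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Returns 1 if the line starts with "Vulnerable", 0 otherwise. */
static int line_is_vulnerable(const char *line)
{
	return strncmp(line, "Vulnerable", strlen("Vulnerable")) == 0;
}
```

A caller would pass OLD_UCODE_PATH to read_vuln_line() and hand the result
to line_is_vulnerable(). With this patch the file reads "Vulnerable" when
the bug bit is set; when it is not set, the existing cpu_show_common() path
is expected to report "Not affected".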

== Microcode Revision Discussion ==

The microcode versions in the table were generated from the Intel
microcode git repo:

	29f82f7429c ("microcode-20241029 Release")

It can be argued that the versions that the kernel picks to call "old"
should be a revision or two old.  Which specific version is picked is
less important to me than picking *a* version and enforcing it.

This repository contains only microcode versions that Intel has deemed
to be OS-loadable.  It is quite possible that the BIOS has loaded a
newer microcode than the latest in this repo.  That means that the
vulnerability can be considered to answer this question:

	Are we running on the latest OS-loadable microcode,
	or something even later that the BIOS loaded?

In other words, Intel never publishes an authoritative list of CPUs
and latest microcode revisions.  Until it does, this is the best that
Linux can do.

Also note that the "intel-ucode-defs.h" file is simple, ugly and
has lots of magic numbers.  That's on purpose and should allow a
single file to be shared across lots of stable kernels regardless of
whether they have the new "VFM" infrastructure or not.  It was generated
with a dumb script.
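
To make the magic numbers a bit less magic, here is a rough user-space
sketch of how one table row gets matched and compared. It is an
illustration only: struct ucode_min and ucode_is_old() are hypothetical
names, not kernel types, and the three rows are copied from the table in
this patch. The key detail is that .steppings is a bitmask (bit N set
means stepping N matches) and .driver_data is the minimum acceptable
revision.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space mirror of one table row; not a kernel type. */
struct ucode_min {
	uint8_t  family;
	uint8_t  model;
	uint16_t steppings;	/* bitmask: bit N set means stepping N matches */
	uint32_t min_rev;	/* the .driver_data value from the patch */
};

/* Three rows copied from the posted intel-ucode-defs.h. */
static const struct ucode_min ucode_table[] = {
	{ 0x6, 0x8e, 0x0200, 0xf6 },
	{ 0x6, 0x8e, 0x1000, 0xfc },
	{ 0x6, 0x9e, 0x2000, 0x100 },
};

/* Same decision order as cpu_has_old_microcode() in the patch. */
static bool ucode_is_old(uint8_t family, uint8_t model, uint8_t stepping,
			 uint32_t rev)
{
	size_t i;

	for (i = 0; i < sizeof(ucode_table) / sizeof(ucode_table[0]); i++) {
		const struct ucode_min *m = &ucode_table[i];

		if (m->family != family || m->model != model)
			continue;
		/* The mask is 16 bits wide, so only steppings 0..15 can match. */
		if (stepping > 15 || !(m->steppings & (1u << stepping)))
			continue;

		/* Debug microcode (bit 31 set) always counts as old. */
		if (rev & (1u << 31))
			return true;

		return rev < m->min_rev;
	}

	/* Unknown CPUs get a pass. */
	return false;
}
```

For example, model 0x8e stepping 9 selects the 0x0200 row, so revision
0xf4 would be flagged and 0xf6 would not.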

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/cpufeatures.h               |    1 
 b/arch/x86/kernel/cpu/bugs.c                       |    8 +
 b/arch/x86/kernel/cpu/common.c                     |   28 +++
 b/arch/x86/kernel/cpu/microcode/intel-ucode-defs.h |  150 +++++++++++++++++++++
 b/drivers/base/cpu.c                               |    3 
 b/include/linux/cpu.h                              |    2 
 6 files changed, 192 insertions(+)

diff -puN arch/x86/include/asm/cpufeatures.h~old-ucode-0 arch/x86/include/asm/cpufeatures.h
--- a/arch/x86/include/asm/cpufeatures.h~old-ucode-0	2024-11-06 07:51:07.364536033 -0800
+++ b/arch/x86/include/asm/cpufeatures.h	2024-11-06 07:51:07.372536037 -0800
@@ -525,4 +525,5 @@
 #define X86_BUG_DIV0			X86_BUG(1*32 + 1) /* "div0" AMD DIV0 speculation bug */
 #define X86_BUG_RFDS			X86_BUG(1*32 + 2) /* "rfds" CPU is vulnerable to Register File Data Sampling */
 #define X86_BUG_BHI			X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */
+#define X86_BUG_OLD_MICROCODE		X86_BUG(1*32 + 4) /* "old_microcode" CPU has old microcode, it is surely vulnerable to something */
 #endif /* _ASM_X86_CPUFEATURES_H */
diff -puN arch/x86/kernel/cpu/common.c~old-ucode-0 arch/x86/kernel/cpu/common.c
--- a/arch/x86/kernel/cpu/common.c~old-ucode-0	2024-11-06 07:51:07.368536035 -0800
+++ b/arch/x86/kernel/cpu/common.c	2024-11-07 09:01:58.709687132 -0800
@@ -1317,6 +1317,31 @@ static bool __init vulnerable_to_rfds(u6
 	return cpu_matches(cpu_vuln_blacklist, RFDS);
 }
 
+struct x86_cpu_id cpu_latest_microcode[] = {
+#include "microcode/intel-ucode-defs.h"
+	{}
+};
+
+static bool __init cpu_has_old_microcode(void)
+{
+	const struct x86_cpu_id *m = x86_match_cpu(cpu_latest_microcode);
+
+	/* Give unknown CPUs a pass: */
+	if (!m)
+		return false;
+
+	/* Consider all debug microcode to be old: */
+	if (boot_cpu_data.microcode & BIT(31))
+		return true;
+
+	/* Give new microcode a pass: */
+	if (boot_cpu_data.microcode >= m->driver_data)
+		return false;
+
+	/* Uh oh, too old: */
+	return true;
+}
+
 static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 {
 	u64 x86_arch_cap_msr = x86_read_arch_cap_msr();
@@ -1443,6 +1468,9 @@ static void __init cpu_set_bug_bits(stru
 	     boot_cpu_has(X86_FEATURE_HYPERVISOR)))
 		setup_force_cpu_bug(X86_BUG_BHI);
 
+	if (cpu_has_old_microcode())
+		setup_force_cpu_bug(X86_BUG_OLD_MICROCODE);
+
 	if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
 		return;
 
diff -puN arch/x86/kernel/cpu/microcode/intel-ucode-defs.h~old-ucode-0 arch/x86/kernel/cpu/microcode/intel-ucode-defs.h
--- a/arch/x86/kernel/cpu/microcode/intel-ucode-defs.h~old-ucode-0	2024-11-06 07:51:07.368536035 -0800
+++ b/arch/x86/kernel/cpu/microcode/intel-ucode-defs.h	2024-11-07 09:01:21.269674736 -0800
@@ -0,0 +1,150 @@
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x03, .steppings = 0x0004, .driver_data = 0x2 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x05, .steppings = 0x0001, .driver_data = 0x45 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x05, .steppings = 0x0002, .driver_data = 0x40 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x05, .steppings = 0x0004, .driver_data = 0x2c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x05, .steppings = 0x0008, .driver_data = 0x10 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x06, .steppings = 0x0001, .driver_data = 0xa }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x06, .steppings = 0x0020, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x06, .steppings = 0x0400, .driver_data = 0xd }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x06, .steppings = 0x2000, .driver_data = 0x7 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x07, .steppings = 0x0002, .driver_data = 0x14 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x07, .steppings = 0x0004, .driver_data = 0x38 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x07, .steppings = 0x0008, .driver_data = 0x2e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x08, .steppings = 0x0002, .driver_data = 0x11 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x08, .steppings = 0x0008, .driver_data = 0x8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x08, .steppings = 0x0040, .driver_data = 0xc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x08, .steppings = 0x0400, .driver_data = 0x5 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x09, .steppings = 0x0020, .driver_data = 0x47 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0a, .steppings = 0x0001, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0a, .steppings = 0x0002, .driver_data = 0x1 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0b, .steppings = 0x0002, .driver_data = 0x1d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0b, .steppings = 0x0010, .driver_data = 0x2 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0d, .steppings = 0x0040, .driver_data = 0x18 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0e, .steppings = 0x0100, .driver_data = 0x39 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0e, .steppings = 0x1000, .driver_data = 0x59 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0f, .steppings = 0x0004, .driver_data = 0x5d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0f, .steppings = 0x0040, .driver_data = 0xd2 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0f, .steppings = 0x0080, .driver_data = 0x6b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0f, .steppings = 0x0400, .driver_data = 0x95 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0f, .steppings = 0x0800, .driver_data = 0xbc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x0f, .steppings = 0x2000, .driver_data = 0xa4 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x16, .steppings = 0x0002, .driver_data = 0x44 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x17, .steppings = 0x0040, .driver_data = 0x60f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x17, .steppings = 0x0080, .driver_data = 0x70a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x17, .steppings = 0x0400, .driver_data = 0xa0b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x1a, .steppings = 0x0010, .driver_data = 0x12 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x1a, .steppings = 0x0020, .driver_data = 0x1d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x1c, .steppings = 0x0004, .driver_data = 0x219 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x1c, .steppings = 0x0400, .driver_data = 0x107 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x1d, .steppings = 0x0002, .driver_data = 0x29 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x1e, .steppings = 0x0020, .driver_data = 0xa }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x25, .steppings = 0x0004, .driver_data = 0x11 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x25, .steppings = 0x0020, .driver_data = 0x7 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x26, .steppings = 0x0002, .driver_data = 0x105 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x2a, .steppings = 0x0080, .driver_data = 0x2f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x2c, .steppings = 0x0004, .driver_data = 0x1f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x2d, .steppings = 0x0040, .driver_data = 0x621 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x2d, .steppings = 0x0080, .driver_data = 0x71a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x2e, .steppings = 0x0040, .driver_data = 0xd }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x2f, .steppings = 0x0004, .driver_data = 0x3b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x37, .steppings = 0x0100, .driver_data = 0x838 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x37, .steppings = 0x0200, .driver_data = 0x90d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3a, .steppings = 0x0200, .driver_data = 0x21 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3c, .steppings = 0x0008, .driver_data = 0x28 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3d, .steppings = 0x0010, .driver_data = 0x2f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3e, .steppings = 0x0010, .driver_data = 0x42e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3e, .steppings = 0x0040, .driver_data = 0x600 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3e, .steppings = 0x0080, .driver_data = 0x715 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3f, .steppings = 0x0004, .driver_data = 0x49 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x3f, .steppings = 0x0010, .driver_data = 0x1a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x45, .steppings = 0x0002, .driver_data = 0x26 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x46, .steppings = 0x0002, .driver_data = 0x1c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x47, .steppings = 0x0002, .driver_data = 0x22 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x4c, .steppings = 0x0008, .driver_data = 0x368 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x4c, .steppings = 0x0010, .driver_data = 0x411 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x4d, .steppings = 0x0100, .driver_data = 0x12d }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x4e, .steppings = 0x0008, .driver_data = 0xf0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x55, .steppings = 0x0008, .driver_data = 0x1000191 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x55, .steppings = 0x0010, .driver_data = 0x2007006 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x55, .steppings = 0x0020, .driver_data = 0x3000010 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x55, .steppings = 0x0040, .driver_data = 0x4003605 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x55, .steppings = 0x0080, .driver_data = 0x5003707 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x55, .steppings = 0x0800, .driver_data = 0x7002904 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x56, .steppings = 0x0004, .driver_data = 0x1c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x56, .steppings = 0x0008, .driver_data = 0x700001c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x56, .steppings = 0x0010, .driver_data = 0xf00001a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x56, .steppings = 0x0020, .driver_data = 0xe000015 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x5c, .steppings = 0x0004, .driver_data = 0x14 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x5c, .steppings = 0x0200, .driver_data = 0x48 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x5c, .steppings = 0x0400, .driver_data = 0x28 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x5e, .steppings = 0x0008, .driver_data = 0xf0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x5f, .steppings = 0x0002, .driver_data = 0x3e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x66, .steppings = 0x0008, .driver_data = 0x2a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x6a, .steppings = 0x0020, .driver_data = 0xc0002f0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x6a, .steppings = 0x0040, .driver_data = 0xd0003e7 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x6c, .steppings = 0x0002, .driver_data = 0x10002b0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x7a, .steppings = 0x0002, .driver_data = 0x42 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x7a, .steppings = 0x0100, .driver_data = 0x24 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x7e, .steppings = 0x0020, .driver_data = 0xc6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8a, .steppings = 0x0002, .driver_data = 0x33 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8c, .steppings = 0x0002, .driver_data = 0xb8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8c, .steppings = 0x0004, .driver_data = 0x38 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8d, .steppings = 0x0002, .driver_data = 0x52 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8e, .steppings = 0x0200, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8e, .steppings = 0x0400, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8e, .steppings = 0x0800, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8e, .steppings = 0x1000, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8f, .steppings = 0x0100, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8f, .steppings = 0x0080, .driver_data = 0x2b0005c0 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8f, .steppings = 0x0040, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8f, .steppings = 0x0020, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x8f, .steppings = 0x0010, .driver_data = 0x2c000390 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x96, .steppings = 0x0002, .driver_data = 0x1a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x97, .steppings = 0x0004, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x97, .steppings = 0x0020, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xbf, .steppings = 0x0004, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xbf, .steppings = 0x0020, .driver_data = 0x36 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9a, .steppings = 0x0008, .driver_data = 0x434 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9a, .steppings = 0x0010, .driver_data = 0x434 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9c, .steppings = 0x0001, .driver_data = 0x24000026 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9e, .steppings = 0x0200, .driver_data = 0xf8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9e, .steppings = 0x0400, .driver_data = 0xf8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9e, .steppings = 0x0800, .driver_data = 0xf6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9e, .steppings = 0x1000, .driver_data = 0xf8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0x9e, .steppings = 0x2000, .driver_data = 0x100 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xa5, .steppings = 0x0004, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xa5, .steppings = 0x0008, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xa5, .steppings = 0x0020, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xa6, .steppings = 0x0001, .driver_data = 0xfe }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xa6, .steppings = 0x0002, .driver_data = 0xfc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xa7, .steppings = 0x0002, .driver_data = 0x62 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xaa, .steppings = 0x0010, .driver_data = 0x1f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xb7, .steppings = 0x0002, .driver_data = 0x12b }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xba, .steppings = 0x0004, .driver_data = 0x4122 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xba, .steppings = 0x0008, .driver_data = 0x4122 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xba, .steppings = 0x0100, .driver_data = 0x4122 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xbe, .steppings = 0x0001, .driver_data = 0x1a }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xcf, .steppings = 0x0004, .driver_data = 0x21000230 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xcf, .steppings = 0x0002, .driver_data = 0x21000230 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x00, .steppings = 0x0080, .driver_data = 0x12 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x00, .steppings = 0x0400, .driver_data = 0x15 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x01, .steppings = 0x0004, .driver_data = 0x2e }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x02, .steppings = 0x0010, .driver_data = 0x21 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x02, .steppings = 0x0020, .driver_data = 0x2c }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x02, .steppings = 0x0040, .driver_data = 0x10 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x02, .steppings = 0x0080, .driver_data = 0x39 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x02, .steppings = 0x0200, .driver_data = 0x2f }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x03, .steppings = 0x0004, .driver_data = 0xa }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x03, .steppings = 0x0008, .driver_data = 0xc }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x03, .steppings = 0x0010, .driver_data = 0x17 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x04, .steppings = 0x0002, .driver_data = 0x17 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x04, .steppings = 0x0008, .driver_data = 0x5 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x04, .steppings = 0x0010, .driver_data = 0x6 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x04, .steppings = 0x0080, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x04, .steppings = 0x0100, .driver_data = 0xe }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x04, .steppings = 0x0200, .driver_data = 0x3 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x04, .steppings = 0x0400, .driver_data = 0x4 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x06, .steppings = 0x0004, .driver_data = 0xf }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x06, .steppings = 0x0010, .driver_data = 0x4 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x06, .steppings = 0x0020, .driver_data = 0x8 }
+{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0xf,  .model = 0x06, .steppings = 0x0100, .driver_data = 0x9 }
diff -puN arch/x86/kernel/cpu/bugs.c~old-ucode-0 arch/x86/kernel/cpu/bugs.c
--- a/arch/x86/kernel/cpu/bugs.c~old-ucode-0	2024-11-06 07:58:23.256734986 -0800
+++ b/arch/x86/kernel/cpu/bugs.c	2024-11-06 08:31:44.045967210 -0800
@@ -2945,6 +2945,9 @@ static ssize_t cpu_show_common(struct de
 	case X86_BUG_RFDS:
 		return rfds_show_state(buf);
 
+	case X86_BUG_OLD_MICROCODE:
+		return sysfs_emit(buf, "Vulnerable\n");
+
 	default:
 		break;
 	}
@@ -3024,6 +3027,11 @@ ssize_t cpu_show_reg_file_data_sampling(
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_RFDS);
 }
+
+ssize_t cpu_show_old_microcode(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_OLD_MICROCODE);
+}
 #endif
 
 void __warn_thunk(void)
diff -puN drivers/base/cpu.c~old-ucode-0 drivers/base/cpu.c
--- a/drivers/base/cpu.c~old-ucode-0	2024-11-06 08:26:51.505813735 -0800
+++ b/drivers/base/cpu.c	2024-11-06 08:29:56.925911521 -0800
@@ -599,6 +599,7 @@ CPU_SHOW_VULN_FALLBACK(retbleed);
 CPU_SHOW_VULN_FALLBACK(spec_rstack_overflow);
 CPU_SHOW_VULN_FALLBACK(gds);
 CPU_SHOW_VULN_FALLBACK(reg_file_data_sampling);
+CPU_SHOW_VULN_FALLBACK(old_microcode);
 
 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
@@ -614,6 +615,7 @@ static DEVICE_ATTR(retbleed, 0444, cpu_s
 static DEVICE_ATTR(spec_rstack_overflow, 0444, cpu_show_spec_rstack_overflow, NULL);
 static DEVICE_ATTR(gather_data_sampling, 0444, cpu_show_gds, NULL);
 static DEVICE_ATTR(reg_file_data_sampling, 0444, cpu_show_reg_file_data_sampling, NULL);
+static DEVICE_ATTR(old_microcode, 0444, cpu_show_old_microcode, NULL);
 
 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
@@ -630,6 +632,7 @@ static struct attribute *cpu_root_vulner
 	&dev_attr_spec_rstack_overflow.attr,
 	&dev_attr_gather_data_sampling.attr,
 	&dev_attr_reg_file_data_sampling.attr,
+	&dev_attr_old_microcode.attr,
 	NULL
 };
 
diff -puN include/linux/cpu.h~old-ucode-0 include/linux/cpu.h
--- a/include/linux/cpu.h~old-ucode-0	2024-11-06 08:32:44.333998355 -0800
+++ b/include/linux/cpu.h	2024-11-06 08:33:02.998007967 -0800
@@ -77,6 +77,8 @@ extern ssize_t cpu_show_gds(struct devic
 			    struct device_attribute *attr, char *buf);
 extern ssize_t cpu_show_reg_file_data_sampling(struct device *dev,
 					       struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_old_microcode(struct device *dev,
+				      struct device_attribute *attr, char *buf);
 
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,
_
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Pawan Gupta 4 days, 6 hours ago
On Thu, Nov 07, 2024 at 09:06:30AM -0800, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> You can't practically run old microcode and consider a system secure
> these days.  So, let's call old microcode what it is: a vulnerability.

> Expose that vulnerability in a place that folks can find it:
> 
> 	/sys/devices/system/cpu/vulnerabilities/old_microcode

Sorry for playing the devil's advocate. I am wondering who is the prime
beneficiary of this change?

Roughly dividing the user base into:

1. People who get their updates from a distro. As distros also provide the
   microcode, most likely their kernel will be patched to agree that the
   microcode that they provide has the latest security fixes. Effectively,
   distros have control over what the kernel reports.

2. People who get their updates from a distro, but build their own kernel
   could benefit from this change. Broadly these would be CSPs/embedded
   vendors/developers etc.

   - I am assuming CSPs are well versed with the microcode updates and
     hand-pick the microcode that they want to apply. So, they may not
     care too much about microcode being old. And the majority of their
     users, who run workloads in guest VMs, won't see the microcode version.

   - In my experience, embedded vendors generally take a very long time to
     provide updates. They could benefit from this change when they
     eventually update their kernel.

   - Expert users/developers who submit bug reports to mailing lists can
     now know that they are running old microcode, and should update their
     microcode before submitting a bug report. To me they would benefit
     the most from this change.
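
     As a side note for that bug-report case, the currently loaded
     revision is already visible in the "microcode" field of
     /proc/cpuinfo. A small sketch of parsing one such line follows;
     parse_cpuinfo_microcode() is a hypothetical helper name, and the
     sketch assumes the usual x86 "microcode\t: 0x..." line format.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one /proc/cpuinfo line of the form "microcode\t: 0xf6".
 * Returns 0 and stores the revision on success, -1 for any other line.
 */
static int parse_cpuinfo_microcode(const char *line, uint32_t *rev)
{
	const char *colon;
	char *end;
	unsigned long val;

	if (strncmp(line, "microcode", strlen("microcode")) != 0)
		return -1;

	colon = strchr(line, ':');
	if (!colon)
		return -1;

	/* Base 16 accepts an optional "0x" prefix and skips leading blanks. */
	val = strtoul(colon + 1, &end, 16);
	if (end == colon + 1)
		return -1;

	*rev = (uint32_t)val;
	return 0;
}
```

     A reporter could include the parsed value next to the contents of
     the old_microcode file when filing a bug.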

For this to help category 1 users, we need a blessing from distro
providers. It would be great if more distros provided their
agreement/feedback on this change, as they are the ones who would enforce
it.
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Dave Hansen 4 days, 5 hours ago
On 11/19/24 09:45, Pawan Gupta wrote:
> Sorry for playing the devil's advocate. I am wondering who is the prime
> beneficiary of this change?

At a very high level, it's for folks with new kernels and old microcode.

It's _very_ normal for someone to report a bug and for us upstream folks
to ask them to reproduce on the latest mainline. The moment they do
that, they get the latest microcode list. Folks don't randomly upgrade
to a new kernel for fun in production. But it's hopefully a very normal
activity for folks having problems and launching into debug.

In other words, "new kernel / old microcode" might be relatively rare,
but it still gets used at a *very* critical choke point.

I completely agree with your general sentiment that normal distro users
will get the distro-kernel-provided microcode version list _and_
distro-provided microcode files. This won't help them one bit unless the
distro makes a silly mistake, doesn't do testing, or they somehow
upgrade one package without the other.
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Pawan Gupta 4 days, 4 hours ago
On Tue, Nov 19, 2024 at 10:49:21AM -0800, Dave Hansen wrote:
> On 11/19/24 09:45, Pawan Gupta wrote:
> > Sorry for playing the devil's advocate. I am wondering who is the prime
> > beneficiary of this change?
> 
> At a very high level, it's for folks with new kernels and old microcode.
> 
> It's _very_ normal for someone to report a bug and for us upstream folks
> to ask them to reproduce on the latest mainline. The moment they do
> that, they get the latest microcode list. Folks don't randomly upgrade
> to a new kernel for fun in production. But it's hopefully a very normal
> activity for folks having problems and launching into debug.

Ah, that makes sense.

> In other words, "new kernel / old microcode" might be relatively rare,
> but it still gets used at a *very* critical choke point.

Right.
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Alex Murray 1 week, 4 days ago
> 
> == Microcode Revision Discussion ==
> 
> The microcode versions in the table were generated from the Intel
> microcode git repo:
>
>  	29f82f7429c ("microcode-20241029 Release")

This upstream microcode release only contained an update for a
functional issue[1] - not any fixes for security issues. So it would not
really be correct to say a machine running the previous microcode
revision is vulnerable. As such, should the table of microcode revisions
only be generated from the upstream microcode releases that contain
fixes for security issues? 

i.e.

> +{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xb7, .steppings = 0x0002, .driver_data = 0x12b }

should ideally be:

> +{ .flags = X86_CPU_ID_FLAG_ENTRY_VALID, .vendor = X86_VENDOR_INTEL, .family = 0x6,  .model = 0xb7, .steppings = 0x0002, .driver_data = 0x129 }

to correspond with the previous microcode release that contained actual
security fixes. 
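
For illustration only, here is a minimal user-space sketch of how a table
entry like the ones above could be consulted. The struct and function names
(ucode_min_rev, ucode_is_old) are hypothetical stand-ins, not the patch's
code. The sketch assumes that .steppings is a bitmask in which bit N selects
stepping N, that .driver_data holds the revision below which microcode
counts as old, and that CPUs missing from the table are simply not flagged.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one table entry; not the kernel's struct x86_cpu_id. */
struct ucode_min_rev {
	uint8_t  family;
	uint8_t  model;
	uint16_t steppings;	/* assumed bitmask: bit N set means stepping N matches */
	uint32_t min_rev;	/* plays the role of .driver_data in the patch */
};

static const struct ucode_min_rev ucode_table[] = {
	{ .family = 0x6, .model = 0xb7, .steppings = 0x0002, .min_rev = 0x12b },
};

/*
 * Return true if the running revision is older than the table's revision.
 * CPUs without a matching entry are not flagged (an assumption of this sketch).
 */
static bool ucode_is_old(uint8_t family, uint8_t model, uint8_t stepping,
			 uint32_t rev)
{
	for (size_t i = 0; i < sizeof(ucode_table) / sizeof(ucode_table[0]); i++) {
		const struct ucode_min_rev *e = &ucode_table[i];

		if (e->family != family || e->model != model)
			continue;
		if (!(e->steppings & (1u << stepping)))
			continue;
		return rev < e->min_rev;
	}
	return false;
}
```

With the 0x12b entry, a stepping-1 part running revision 0x129 is reported
as old. Changing the entry to 0x129, as suggested above, would make the same
part pass.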


[1] https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files/releases/tag/microcode-20241029
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Andrew Cooper 1 week, 2 days ago
>> == Microcode Revision Discussion == The microcode versions in the
>> table were generated from the Intel microcode git repo: 29f82f7429c
>> ("microcode-20241029 Release")
> This upstream microcode release only contained an update for a
> functional issue[1] - not any fixes for security issues.

Now I can point at them, see the release notes for 8ac9378a8487
("microcode-20241112 Release").

Note how it's admitting to have fixed security issues silently in prior
drops.  If I were you, I wouldn't make assumptions based on what's not
said in the release notes.

I can count on one hand the number of drops in that repo which I know
(or reasonably suspect) to be "functional issues only", but the one you
happened to reference is a fix for "this CPU overvolts itself to an
early grave".  Many would consider this a denial of service, against
one's wallet if nothing else.

~Andrew
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Dave Hansen 5 days, 4 hours ago
On 11/13/24 18:09, Andrew Cooper wrote:
> Note how it's admitting to have fixed security issues silently in prior
> drops.  If I were you, I wouldn't make assumptions based on what's not
> said in the release notes.

I've gotten two pieces of feedback. Paraphrasing, one bit of feedback
from Andrew says:

	Don't explicitly trust the release notes. Here's an active,
	super recent example of when you can't trust them.

and Alex (elsewhere in the thread) says:

	The kernel should explicitly trust the release notes (for
	security vulnerability statements at least)

I'm partial to Andrew's position here because of the real-world recent
evidence.

It also occurs to me that Intel could have done this for two reasons:
First, it could be an attempt to do coordinated disclosure when the
microcode to fix the issue is ready well in advance of an issue being
disclosed.  The second is that a human made an error and neglected to
mention the security issues in the release notes.

We can fix the first issue by asking my Intel colleagues to not do that
in the future.  I'd be happy to do that if folks want.

But we can't fix the second issue until we have either infallible humans
(or AI).  I'm not sure either one is on the horizon. ;)
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Dave Hansen 1 week, 4 days ago
On 11/11/24 22:37, Alex Murray wrote:
>> == Microcode Revision Discussion ==
>>
>> The microcode versions in the table were generated from the Intel
>> microcode git repo:
>>
>>  	29f82f7429c ("microcode-20241029 Release")
> 
> This upstream microcode release only contained an update for a
> functional issue[1] - not any fixes for security issues. So it would not
> really be correct to say a machine running the previous microcode
> revision is vulnerable. 

There are literally two things this patch "says".  One is in userspace
and can be literally read as:

	/sys/devices/system/cpu/vulnerabilities/old_microcode

"You are vulnerable to old CPU microcode".

The other is in the code: X86_BUG_OLD_MICROCODE.  Which can literally be
read to say "you have a CPU bug called 'old microcode'". (Oh, and I guess
this comes out in /proc/cpuinfo too).

If you think this is confusing, we can document our way out of it or
revise the changelog.  But we kinda get to define what the file and the
X86_BUG mean in the first place.

I don't really see how it's possible to argue that they're "incorrect".

> As such, should the table of microcode revisions only be generated
> from the upstream microcode releases that contain fixes for security
> issues?

No, I don't think so. First, I honestly don't want to have this
discussion every three months where folks can argue about whether a
given microcode release is functional or security.  Or, even worse,
which individual microcode *image* is which.

Second, running kernels with functional issues is *BAD*.  As a kernel
policy, we don't want users running with old microcode. Security bugs
only hurt our users but functional bugs hurt the kernel too because
users blame the kernel when they hit them and kernel developers spend
time chasing those issues down.

So I guess it boils down to: First, should we tell users when their
microcode is old?  If so, how should we do it?
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Nikolay Borisov 1 week, 3 days ago

On 12.11.24 г. 17:51 ч., Dave Hansen wrote:
> On 11/11/24 22:37, Alex Murray wrote:
>>> == Microcode Revision Discussion ==
>>>
>>> The microcode versions in the table were generated from the Intel
>>> microcode git repo:
>>>
>>>   	29f82f7429c ("microcode-20241029 Release")
>>
>> This upstream microcode release only contained an update for a
>> functional issue[1] - not any fixes for security issues. So it would not
>> really be correct to say a machine running the previous microcode
>> revision is vulnerable.
> 
> There are literally two things this patch "says".  One is in userspace
> and can be literally read as:
> 
> 	/sys/devices/system/cpu/vulnerabilities/old_microcode
> 
> "You are vulnerable to old CPU microcode".
> 
> The other is in the code: X86_BUG_OLD_MICROCODE.  Which can literally be
> read to say "you have a CPU bug called 'old microcode'". (Oh, and I guess
> this comes out in /proc/cpuinfo too).
> 
> If you think this is confusing, we can document our way out of it or
> revise the changelog.  But we kinda get to define what the file and the
> X86_BUG mean in the first place.
> 
> I don't really see how it's possible to argue that they're "incorrect".
> 
>> As such, should the table of microcode revisions only be generated
>> from the upstream microcode releases that contain fixes for security
>> issues?
> 
> No, I don't think so. First, I honestly don't want to have this
> discussion every three months where folks can argue about whether a
> given microcode release is functional or security.  Or, even worse,
> which individual microcode *image* is which.
> 
> Second, running kernels with functional issues is *BAD*.  As a kernel
> policy, we don't want users running with old microcode. Security bugs
> only hurt our users but functional bugs hurt the kernel too because
> users blame the kernel when they hit them and kernel developers spend
> time chasing those issues down.

<Perhaps offtopic>

Probably the same reasoning can be applied here as for the CVEs - since 
the kernel (microcode) is a very fundamental piece of software, almost 
any issue can be treated as a security one (at least judging from the 
influx of automatically generated CVEs). By the same token we can assume 
that microcode always fixes a critical issue :)

</Perhaps offtopic>

> 
> So I guess it boils down to: First, should we tell users when their
> microcode is old?  If so, how should we do it?
> 

Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Alex Murray 1 week, 3 days ago
On Tue, 2024-11-12 at 07:51:38 -0800, Dave Hansen wrote:

> On 11/11/24 22:37, Alex Murray wrote:
>>> == Microcode Revision Discussion ==
>>>
>>> The microcode versions in the table were generated from the Intel
>>> microcode git repo:
>>>
>>>  	29f82f7429c ("microcode-20241029 Release")
>> 
>> This upstream microcode release only contained an update for a
>> functional issue[1] - not any fixes for security issues. So it would not
>> really be correct to say a machine running the previous microcode
>> revision is vulnerable. 
>
> There are literally two things this patch "says".  One is in userspace
> and can be literally read as:
>
> 	/sys/devices/system/cpu/vulnerabilities/old_microcode
>
> "You are vulnerable to old CPU microcode".
>
> The other is in the code: X86_BUG_OLD_MICROCODE.  Which can literally be
> read to say "you have a CPU bug called 'old microcode'". (Oh, and I guess
> this comes out in /proc/cpuinfo too).
>
> If you think this is confusing, we can document our way out of it or
> revise the changelog.  But we kinda get to define what the file and the
> X86_BUG mean in the first place.
>
> I don't really see how it's possible to argue that they're
> "incorrect".

My point is that if a given microcode contains only functional updates,
then if you are *not* using it you do not have a security
vulnerability. If however the specified microcode revision fixes a known
security issue then yes I agree, there is a vulnerability and if you are
not using this microcode revision you are vulnerable to it. It is really
the distinction between a microcode update that is purely for functional
issues compared to one that is for security issues as well.

>
>> As such, should the table of microcode revisions only be generated
>> from the upstream microcode releases that contain fixes for security
>> issues?
>
> No, I don't think so. First, I honestly don't want to have this
> discussion every three months where folks can argue about whether a
> given microcode release is functional or security.  Or, even worse,
> which individual microcode *image* is which.

I don't think there is an argument here - releases at
https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files
clearly say if they contain Security updates or updates for functional
issues - so if a release like the previous 20241029 one only contains an
update for functional issues it should not be treated as a security
issue if a system is not running it.

>
> Second, running kernels with functional issues is *BAD*.  As a kernel
> policy, we don't want users running with old microcode. Security bugs
> only hurt our users but functional bugs hurt the kernel too because
> users blame the kernel when they hit them and kernel developers spend
> time chasing those issues down.
>

But just because something is bad that doesn't mean it is a security
vulnerability. One option could be to taint the kernel in this case
instead.

> So I guess it boils down to: First, should we tell users when their
> microcode is old?  If so, how should we do it?

So I suggest instead that, if you really want to flag old microcode as an
issue, you taint the kernel as such, since the description of tainted is

> The kernel will mark itself as ‘tainted’ when something occurs that
> might be relevant later when investigating problems

which feels like exactly the kind of semantics you describe above.

Then, if you also want to surface old microcode that is missing
security fixes, you could use your newly proposed mechanism.
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Dave Hansen 1 week, 3 days ago
On 11/12/24 19:29, Alex Murray wrote:
>> No, I don't think so. First, I honestly don't want to have this
>> discussion every three months where folks can argue about whether a
>> given microcode release is functional or security.  Or, even worse,
>> which individual microcode *image* is which.
> I don't think there is an argument here - releases at
> https://github.com/intel/Intel-Linux-Processor-Microcode-Data-Files
> clearly say if they contain Security updates or updates for functional
> issues - so if a release like the previous 20241029 one only contains an
> update for functional issues it should not be treated as a security
> issue if a system is not running it.

While I applaud your trust in my employer, I don't see quite as bright
of a line between security and functional problems.

Here's the bottom line: I agree that setting a taint flag for old
microcode seems like a good idea. But I also think that there's enough
of a "vulnerability" (security or otherwise) to justify placing
"old_microcode" alongside the CPU security vulnerabilities that have
known exploits.

I'm lazy and don't want to read and filter the microcode changelogs.  I
also don't want to have to trust my colleagues to precisely agree on
where that line is between a security and functional problem.

So I'm leaning toward setting:

	TAINT_CPU_OUT_OF_SPEC
plus
	X86_BUG_OLD_MICROCODE

and calling it a day.
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Alex Murray 1 week, 3 days ago
On Wed, 2024-11-13 at 08:00:26 -0800, Dave Hansen wrote:

> While I applaud your trust in my employer, I don't see quite as bright
> of a line between security and functional problems.
>
> Here's the bottom line: I agree that setting a taint flag for old
> microcode seems like a good idea. But I also think that there's enough
> of a "vulnerability" (security or otherwise) to justify placing
> "old_microcode" alongside the CPU security vulnerabilities that have
> known exploits.
>
> I'm lazy and don't want to read and filter the microcode changelogs.  I
> also don't want to have to trust my colleagues to precisely agree on
> where that line is between a security and functional problem.
>

The only other data point then to mention is that all the major distros
(Debian[1], Ubuntu[2] and Fedora[3]) are still only shipping the previous
security update release (20240910) in their stable releases - *not* the
more recent release with the functional updates in 20241029 - in which
case anyone running a current stable release would then show as being
"vulnerable". I can't speak for the other distros, but for Ubuntu we
generally only ship things which are called out as specific security
fixes in our security updates *and* we generally prioritise security
updates over bug fixes (which these 'functional' updates appear to be
rather than fixing actual exploitable security issues).

> So I'm leaning toward setting:
>
> 	TAINT_CPU_OUT_OF_SPEC
> plus
> 	X86_BUG_OLD_MICROCODE
>
> and calling it a day.

Does this mean you are thinking of dropping the userspace entry in the
cpu vulnerabilities sysfs tree? If so then I am not so concerned, since
my primary concern is having something which looks scary to
users/sysadmins ("your CPU has an unpatched vulnerability") which they
can't do anything about since their distribution has a different
definition of what counts as a security update compared to the upstream
kernel maintainers. If the sysfs entry is dropped then this is not so
visible to end-users and hence there is less panic.


[1] https://packages.debian.org/search?keywords=intel-microcode
[2] https://launchpad.net/ubuntu/+source/intel-microcode
[3] https://packages.fedoraproject.org/pkgs/microcode_ctl/microcode_ctl/fedora-41.html
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Dave Hansen 1 week, 2 days ago
On 11/13/24 15:58, Alex Murray wrote:
...
> The only other data point then to mention is that all the major distros
> (Debian[1], Ubuntu[2] and Fedora[3]) are still only shipping the previous
> security update release (20240910) in their stable releases - *not* the
> more recent release with the functional updates in 20241029 - in which
> case anyone running a current stable release would then show as being
> "vulnerable". I can't speak for the other distros, but for Ubuntu we
> generally only ship things which are called out as specific security
> fixes in our security updates *and* we generally prioritise security
> updates over bug fixes (which these 'functional' updates appear to be
> rather than fixing actual exploitable security issues).

That's a very important data point. Thanks for that.

Like I said in the original changelog, I'm open to relaxing things to
define old to allow folks to be a release or two behind. But I'd want to
hear a lot more about _why_ the distros lag. I'd probably also have some
chats to see what other folks at Intel think about it.

So what would you propose the rules be?  Are you suggesting that we go
through the microcode changelogs for each CPU for each release and only
update the "old" revisions for security issues?  If there were only
functional issues fixed for, say, 2 years, on a CPU would the "old"
version get updated?

>> So I'm leaning toward setting:
>>
>> 	TAINT_CPU_OUT_OF_SPEC
>> plus
>> 	X86_BUG_OLD_MICROCODE
>>
>> and calling it a day.
> 
> Does this mean you are thinking of dropping the userspace entry in the
> cpu vulnerabilities sysfs tree?

No, I plan to keep X86_BUG_OLD_MICROCODE and the corresponding sysfs entry.

> If so then I am not so concerned, since my primary concern is having
> something which looks scary to users/sysadmins ("your CPU has an
> unpatched vulnerability") which they can't do anything about since
> their distribution has a different definition of what counts as a
> security update compared to the upstream kernel maintainers. If the
> sysfs entry is dropped then this is not so visible to end-users and
> hence there is less panic.

Right, we don't want to unnecessarily scare anyone.

But if a distro is being too slow in getting microcode out, then it
would be good to inform users about known functional or security gaps
they're exposed to.

That's the thing we need to focus on.  Not: "Can users do anything about
it?" Rather: "What's best for the users?"
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Alex Murray 5 days, 15 hours ago
On Wed, 2024-11-13 at 16:37:31 -0800, Dave Hansen wrote:
> On 11/13/24 15:58, Alex Murray wrote:
> ...
>> The only other data point then to mention is that all the major distros
>> (Debian[1], Ubuntu[2] and Fedora[3]) are still only shipping the previous
>> security update release (20240910) in their stable releases - *not* the
>> more recent release with the functional updates in 20241029 - in which
>> case anyone running a current stable release would then show as being
>> "vulnerable". I can't speak for the other distros, but for Ubuntu we
>> generally only ship things which are called out as specific security
>> fixes in our security updates *and* we generally prioritise security
>> updates over bug fixes (which these 'functional' updates appear to be
>> rather than fixing actual exploitable security issues).
>
> That's a very important data point. Thanks for that.
>
> Like I said in the original changelog, I'm open to relaxing things to
> define old to allow folks to be a release or two behind. But I'd want to
> hear a lot more about _why_ the distros lag. I'd probably also have some
> chats to see what other folks at Intel think about it.

Again, I can't speak for other distros but for Ubuntu see my comment
above re prioritising security vs functional updates.

>
> So what would you propose the rules be?  Are you suggesting that we go
> through the microcode changelogs for each CPU for each release and only
> update the "old" revisions for security issues?  If there were only
> functional issues fixed for, say, 2 years, on a CPU would the "old"
> version get updated?

For calling out old microcode as a vulnerability, yes I would prefer
that only releases which your colleagues state as fixing security issues
get included. However, for the tainted case, anything older than the
current release would make sense. In which case you would have to
maintain two different revision IDs per MCU - one which is the latest,
and the other which is the latest with a security fix for a given
platform. In my experience, though, it is rare for a new upstream
microcode release not to contain some security fixes. So perhaps this
distinction will be mostly irrelevant in practice, assuming almost all
MCU releases contain a fix for some security issue.
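
A rough sketch of the two-revision idea, purely for illustration. All names
here (ucode_revs, ucode_classify, the enum values) are hypothetical, and
mapping the "behind" state to a taint and the "missing security fix" state to
the sysfs vulnerability entry is my assumption about how the two thresholds
would be used, not something the patch implements.

```c
#include <stdint.h>

/* Hypothetical three-way result; none of these names exist in the patch. */
enum ucode_state {
	UCODE_CURRENT,			/* at or beyond the latest release */
	UCODE_BEHIND,			/* older than latest: candidate for a taint */
	UCODE_MISSING_SECURITY_FIX,	/* older than the last security release */
};

/* Two revisions per platform, as described above. */
struct ucode_revs {
	uint32_t latest_rev;		/* newest published revision */
	uint32_t latest_security_rev;	/* newest revision documented as a security fix */
};

static enum ucode_state ucode_classify(const struct ucode_revs *r, uint32_t rev)
{
	/* Check the stricter condition first: it implies being behind as well. */
	if (rev < r->latest_security_rev)
		return UCODE_MISSING_SECURITY_FIX;
	if (rev < r->latest_rev)
		return UCODE_BEHIND;
	return UCODE_CURRENT;
}
```

Taking 0x12b as the latest revision and 0x129 as the latest security
revision (illustrative values only), a CPU at 0x129 would only be "behind",
while one at 0x128 would be missing a security fix.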

>
>>> So I'm leaning toward setting:
>>>
>>> 	TAINT_CPU_OUT_OF_SPEC
>>> plus
>>> 	X86_BUG_OLD_MICROCODE
>>>
>>> and calling it a day.
>> 
>> Does this mean you are thinking of dropping the userspace entry in the
>> cpu vulnerabilities sysfs tree?
>
> No, I plan to keep X86_BUG_OLD_MICROCODE and the corresponding sysfs entry.
>
>> If so then I am not so concerned, since my primary concern is having
>> something which looks scary to users/sysadmins ("your CPU has an
>> unpatched vulnerability") which they can't do anything about since
>> their distribution has a different definition of what counts as a
>> security update compared to the upstream kernel maintainers. If the
>> sysfs entry is dropped then this is not so visible to end-users and
>> hence there is less panic.
>
> Right, we don't want to unnecessarily scare anyone.
>
> But if a distro is being too slow in getting microcode out, then it
> would be good to inform users about known functional or security gaps
> they're exposed to.
>
> That's the thing we need to focus on.  Not: "Can users do anything about
> it?" Rather: "What's best for the users?"

Yep I agree, end-users should always be the primary concern especially
for new user visible things like new entries in the vulnerabilities
sysfs tree. Also I am not averse to calling out the situation of running
an out-of-date microcode *which has known security issues* as a
vulnerability, I think providing more data to users to help them make
the best assessment of any given risk is always a good thing, but we
just need to be mindful to do it in a way that is hopefully actionable
as well.
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Andrew Cooper 2 weeks, 1 day ago
> You can't practically run old microcode and consider a system secure
> these days.  So, let's call old microcode what it is: a vulnerability.

The list becomes stale 4 times a year, so you need to identify when it's
out of date, and whatever that something is has to be strong enough to
cause distros to backport too.  Perhaps a date in the header, so you can
at least report "status vulnerable, metadata out of date".

Also, you want to identify EOL CPUs.  Just because they're on the most
recent published ucode doesn't mean they're not vulnerable.

Under some hypervisors, you get fed the revision 0x7fffffff.  Others
might tell you the truth, or it may be the truth from when you booted. 
For this, probably best to say "consult your hypervisor".

Failure to publish information, or not publishing fixes for in-support
parts should be considered a vulnerability.  (*ahem*, AMD)

Or you could just simplify the whole path to "yes".  It's true, even if
people don't know.

I really want to like this, but it's a giant can of worms, with as many
political challenges as technical.

~Andrew

P.S. I do like that you've labelled debug microcode as vulnerable.  It's
just software in a different form factor, and we know how buggy software
generally is.
Re: [RFC][PATCH] x86/cpu/bugs: Consider having old Intel microcode to be a vulnerability
Posted by Dave Hansen 1 week, 4 days ago
On 11/8/24 15:36, Andrew Cooper wrote:
>> You can't practically run old microcode and consider a system secure
>> these days.  So, let's call old microcode what it is: a vulnerability.
> 
> The list becomes stale 4 times a year, so you need to identify when it's
> out of date, and whatever that something is has to be strong enough to
> cause distros to backport too.  Perhaps a date in the header, so you can
> at least report "status vulnerable, metadata out of date".

I don't want to get too fancy about this.  I'm assuming that mainline
and the stable kernels will be regularly fed new metadata.  The only way
to have out-of-date metadata should be by running an out-of-date kernel
in which case you have bigger problems on your hands.

> Also, you want to identify EOL CPUs.  Just because they're on the most
> recent published ucode doesn't mean they're not vulnerable.

That's a good idea too.  But I think it deserves a separate discussion
and separate patch.

> Under some hypervisors, you get fed the revision 0x7fffffff.  Others
> might tell you the truth, or it may be the truth from when you booted. 
> For this, probably best to say "consult your hypervisor".

Good point.  We should probably just say "unknown" when running as a
guest, or just not have the sysfs file at all.
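
As a sketch of the "unknown" option, assuming a hypothetical helper and
made-up status strings (the real sysfs text is not settled in this thread):
the guest check has to come first, because a hypervisor-supplied revision
such as 0x7fffffff would otherwise compare as newer than anything in the
table.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the status string for the sysfs file.
 * is_guest, rev and latest_rev are supplied by the caller; how the kernel
 * would obtain them is outside the scope of this sketch.
 */
static const char *old_microcode_status(bool is_guest, uint32_t rev,
					 uint32_t latest_rev)
{
	/* A guest cannot trust the revision it is shown, so do not compare it. */
	if (is_guest)
		return "Unknown: running under a hypervisor";
	if (rev < latest_rev)
		return "Vulnerable";
	return "Not affected";
}
```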

> Failure to publish information, or not publishing fixes for in-support
> parts should be considered a vulnerability.  (*ahem*, AMD)
> 
> Or you could just simplify the whole path to "yes".  It's true, even if
> people don't know.

This series answers the question:

	Has the vendor published a newer OS-loadable microcode than you
	are running right now?

It doesn't seek to answer the question:

	Is the microcode that you are running right now vulnerable to
	anything (that the kernel knows about)?

I think the first question is quite answerable in a pretty factual way.
The second question is much harder.  It's worth answering for sure ...
with another patch. :)