[PATCH 1/2] x86/rdrand: implement sanity check for RDSEED

Mikhail Paulyshka posted 2 patches 9 months, 1 week ago
Posted by Mikhail Paulyshka 9 months, 1 week ago
On AMD Cyan Skillfish (Family 0x17 Model 0x47 Stepping 0x0), RDRAND
works correctly but RDSEED returns only 0xFF bytes.

Perform separate sanity checks for RDRAND and RDSEED, as their behavior
may differ.

Signed-off-by: Mikhail Paulyshka <me@mixaill.net>
---
 arch/x86/include/asm/archrandom.h |  1 +
 arch/x86/kernel/cpu/common.c      |  1 +
 arch/x86/kernel/cpu/rdrand.c      | 43 ++++++++++++++++++++++++++++---
 3 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 02bae8e0758b..62ffc8983700 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -57,6 +57,7 @@ static inline size_t __must_check arch_get_random_seed_longs(unsigned long *v, s
 
 #ifndef CONFIG_UML
 void x86_init_rdrand(struct cpuinfo_x86 *c);
+void x86_init_rdseed(struct cpuinfo_x86 *c);
 #endif
 
 #endif /* ASM_X86_ARCHRANDOM_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 7cce91b19fb2..277781863210 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1883,6 +1883,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	}
 
 	x86_init_rdrand(c);
+	x86_init_rdseed(c);
 	setup_pku(c);
 	setup_cet(c);
 
diff --git a/arch/x86/kernel/cpu/rdrand.c b/arch/x86/kernel/cpu/rdrand.c
index eeac00d20926..e9f7ef5dfe25 100644
--- a/arch/x86/kernel/cpu/rdrand.c
+++ b/arch/x86/kernel/cpu/rdrand.c
@@ -12,18 +12,20 @@
 #include <asm/archrandom.h>
 #include <asm/sections.h>
 
+
+enum { SAMPLES = 8, MIN_CHANGE = 5 };
+
 /*
  * RDRAND has Built-In-Self-Test (BIST) that runs on every invocation.
  * Run the instruction a few times as a sanity check. Also make sure
  * it's not outputting the same value over and over, which has happened
  * as a result of past CPU bugs.
  *
- * If it fails, it is simple to disable RDRAND and RDSEED here.
+ * If it fails, it is simple to disable RDRAND here.
  */
 
 void x86_init_rdrand(struct cpuinfo_x86 *c)
 {
-	enum { SAMPLES = 8, MIN_CHANGE = 5 };
 	unsigned long sample, prev;
 	bool failure = false;
 	size_t i, changed;
@@ -44,7 +46,42 @@ void x86_init_rdrand(struct cpuinfo_x86 *c)
 
 	if (failure) {
 		clear_cpu_cap(c, X86_FEATURE_RDRAND);
-		clear_cpu_cap(c, X86_FEATURE_RDSEED);
 		pr_emerg("RDRAND is not reliable on this platform; disabling.\n");
 	}
 }
+
+
+/*
+ * RDSEED has Built-In-Self-Test (BIST) that runs on every invocation.
+ * Run the instruction a few times as a sanity check. Also make sure
+ * it's not outputting the same value over and over, which has happened
+ * as a result of past CPU bugs.
+ *
+ * If it fails, it is simple to disable RDSEED here.
+ */
+
+void x86_init_rdseed(struct cpuinfo_x86 *c)
+{
+	unsigned long sample, prev;
+	bool failure = false;
+	size_t i, changed;
+
+	if (!cpu_has(c, X86_FEATURE_RDSEED))
+		return;
+
+	for (changed = 0, i = 0; i < SAMPLES; ++i) {
+		if (!rdseed_long(&sample)) {
+			failure = true;
+			break;
+		}
+		changed += i && sample != prev;
+		prev = sample;
+	}
+	if (changed < MIN_CHANGE)
+		failure = true;
+
+	if (failure) {
+		clear_cpu_cap(c, X86_FEATURE_RDSEED);
+		pr_emerg("RDSEED is not reliable on this platform; disabling.\n");
+	}
+}
-- 
2.48.1
Re: [PATCH 1/2] x86/rdrand: implement sanity check for RDSEED
Posted by kernel test robot 9 months ago

Hello,

kernel test robot noticed "WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap" on:

commit: 1a98daa004bca11b293ae344384b120c1f3560eb ("[PATCH 1/2] x86/rdrand: implement sanity check for RDSEED")
url: https://github.com/intel-lab-lkp/linux/commits/Mikhail-Paulyshka/x86-rdrand-implement-sanity-check-for-RDSEED/20250312-204319
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 6d536cad0d55e71442b6d65500f74eb85544269e
patch link: https://lore.kernel.org/all/20250312123130.8290-2-me@mixaill.net/
patch subject: [PATCH 1/2] x86/rdrand: implement sanity check for RDSEED

in testcase: igt
version: igt-x86_64-534d75199-1_20250316
with following parameters:

	group: group-23



config: x86_64-rhel-9.4-func
compiler: gcc-12
test machine: 20 threads 1 sockets (Comet Lake) with 16G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202503211421.fc83271a-lkp@intel.com


[    0.995938][    T0] ------------[ cut here ]------------
[    0.995938][    T0] WARNING: CPU: 4 PID: 0 at arch/x86/kernel/cpu/cpuid-deps.c:118 do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[    0.995938][    T0] Modules linked in:
[    0.995938][    T0] CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Not tainted 6.14.0-rc5-00157-g1a98daa004bc #1
[    0.995938][    T0] RIP: 0010:do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[    0.995938][    T0] Code: 89 c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 08 84 d2 0f 85 a8 00 00 00 8b 15 d1 f2 01 05 85 d2 0f 84 ee fd ff ff <0f> 0b e9 e7 fd ff ff 48 c7 c7 40 c3 33 86 e8 4e fd ff ff 49 8d bf
All code
========
   0:	89 c1                	mov    %eax,%ecx
   2:	83 e0 07             	and    $0x7,%eax
   5:	48 c1 e9 03          	shr    $0x3,%rcx
   9:	83 c0 03             	add    $0x3,%eax
   c:	0f b6 14 11          	movzbl (%rcx,%rdx,1),%edx
  10:	38 d0                	cmp    %dl,%al
  12:	7c 08                	jl     0x1c
  14:	84 d2                	test   %dl,%dl
  16:	0f 85 a8 00 00 00    	jne    0xc4
  1c:	8b 15 d1 f2 01 05    	mov    0x501f2d1(%rip),%edx        # 0x501f2f3
  22:	85 d2                	test   %edx,%edx
  24:	0f 84 ee fd ff ff    	je     0xfffffffffffffe18
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	e9 e7 fd ff ff       	jmp    0xfffffffffffffe18
  31:	48 c7 c7 40 c3 33 86 	mov    $0xffffffff8633c340,%rdi
  38:	e8 4e fd ff ff       	call   0xfffffffffffffd8b
  3d:	49                   	rex.WB
  3e:	8d                   	.byte 0x8d
  3f:	bf                   	.byte 0xbf

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	e9 e7 fd ff ff       	jmp    0xfffffffffffffdee
   7:	48 c7 c7 40 c3 33 86 	mov    $0xffffffff8633c340,%rdi
   e:	e8 4e fd ff ff       	call   0xfffffffffffffd61
  13:	49                   	rex.WB
  14:	8d                   	.byte 0x8d
  15:	bf                   	.byte 0xbf
[    0.995938][    T0] RSP: 0000:ffffc90000237d20 EFLAGS: 00010002
[    0.995938][    T0] RAX: 0000000000000003 RBX: ff656fe452696248 RCX: 1ffffffff0c67895
[    0.995938][    T0] RDX: 0000000000000001 RSI: 0000000000000132 RDI: ffff8883e2e28060
[    0.995938][    T0] RBP: ffff8883e2e28060 R08: ffff8883e2e28060 R09: fffffbfff0ef9094
[    0.995938][    T0] R10: ffffffff877c84a3 R11: 0000000000000001 R12: 0000000000000132
[    0.995938][    T0] R13: ffffffff873f8440 R14: ffffffff873f83c0 R15: ffff8883e2e28090
[    0.995938][    T0] FS:  0000000000000000(0000) GS:ffff88845ba72000(0000) knlGS:0000000000000000
[    0.995938][    T0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.995938][    T0] CR2: 0000000000000000 CR3: 000000045b26c001 CR4: 00000000003706b0
[    0.995938][    T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.995938][    T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.995938][    T0] Call Trace:
[    0.995938][    T0]  <TASK>
[    0.995938][    T0]  ? __warn (kernel/panic.c:748)
[    0.995938][    T0]  ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[    0.995938][    T0]  ? report_bug (lib/bug.c:180 lib/bug.c:219)
[    0.995938][    T0]  ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[    0.995938][    T0]  ? handle_bug (arch/x86/kernel/traps.c:337)
[    0.995938][    T0]  ? exc_invalid_op (arch/x86/kernel/traps.c:391 (discriminator 1))
[    0.995938][    T0]  ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:574)
[    0.995938][    T0]  ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[    0.995938][    T0]  ? __pfx_do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:109)
[    0.995938][    T0]  ? init_ia32_feat_ctl (arch/x86/kernel/cpu/feat_ctl.c:189)
[    0.995938][    T0]  x86_init_rdseed (arch/x86/kernel/cpu/rdrand.c:85)
[    0.995938][    T0]  identify_cpu (arch/x86/kernel/cpu/common.c:519 arch/x86/kernel/cpu/common.c:1922)
[    0.995938][    T0]  identify_secondary_cpu (arch/x86/kernel/cpu/common.c:2014)
[    0.995938][    T0]  start_secondary (arch/x86/kernel/smpboot.c:199 arch/x86/kernel/smpboot.c:283)
[    0.995938][    T0]  ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:233)
[    0.995938][    T0]  ? startup_64_load_idt (arch/x86/include/asm/desc.h:215 arch/x86/kernel/head64.c:549)
[    0.995938][    T0]  common_startup_64 (arch/x86/kernel/head_64.S:419)
[    0.995938][    T0]  </TASK>
[    0.995938][    T0] ---[ end trace 0000000000000000 ]---
[    0.995938][    T0] RDSEED is not reliable on this platform; disabling.
[    0.995938][    T0] Masked ExtINT on CPU#5
[    0.995938][    T0] Masked ExtINT on CPU#6
[    0.995938][    T0] RDSEED is not reliable on this platform; disabling.
[    0.995938][    T0] Masked ExtINT on CPU#7
[    0.995938][    T0] RDSEED is not reliable on this platform; disabling.
[    0.995938][    T0] Masked ExtINT on CPU#8
[    0.995938][    T0] RDSEED is not reliable on this platform; disabling.
[    0.995938][    T0] Masked ExtINT on CPU#9
[    0.995938][    T0] RDSEED is not reliable on this platform; disabling.


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250321/202503211421.fc83271a-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH 1/2] x86/rdrand: implement sanity check for RDSEED
Posted by Borislav Petkov 9 months, 1 week ago
On Wed, Mar 12, 2025 at 03:31:29PM +0300, Mikhail Paulyshka wrote:
> +/*
> + * RDSEED has Built-In-Self-Test (BIST) that runs on every invocation.
> + * Run the instruction a few times as a sanity check. Also make sure
> + * it's not outputting the same value over and over, which has happened
> + * as a result of past CPU bugs.
> + *
> + * If it fails, it is simple to disable RDSEED here.
> + */
> +
> +void x86_init_rdseed(struct cpuinfo_x86 *c)
> +{
> +	unsigned long sample, prev;
> +	bool failure = false;
> +	size_t i, changed;
> +
> +	if (!cpu_has(c, X86_FEATURE_RDSEED))
> +		return;
> +
> +	for (changed = 0, i = 0; i < SAMPLES; ++i) {
> +		if (!rdseed_long(&sample)) {
> +			failure = true;
> +			break;
> +		}
> +		changed += i && sample != prev;
> +		prev = sample;
> +	}
> +	if (changed < MIN_CHANGE)
> +		failure = true;
> +
> +	if (failure) {
> +		clear_cpu_cap(c, X86_FEATURE_RDSEED);
> +		pr_emerg("RDSEED is not reliable on this platform; disabling.\n");
> +	}
> +}

This one basically duplicates x86_init_rdrand() and I'm sure you can use
a single function to test both.
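
For illustration only, here is a minimal userspace sketch of what such a shared helper might look like. It is not the code that was eventually proposed or merged. The names rng_sanity_check(), sample_fn and the three stand-in sources are hypothetical; in the kernel the callback would presumably be rdrand_long() or rdseed_long(), and each caller would clear its own feature bit on failure.

```c
#include <stdbool.h>
#include <stddef.h>

enum { SAMPLES = 8, MIN_CHANGE = 5 };

/*
 * Hypothetical callback type: stores one sample in *v and returns true
 * on success, mirroring the shape of rdrand_long()/rdseed_long().
 */
typedef bool (*sample_fn)(unsigned long *v);

/*
 * Shared check (assumed name): draw SAMPLES values and require that at
 * least MIN_CHANGE consecutive pairs differ. Any failed draw fails the
 * whole check, as in the patch above.
 */
static bool rng_sanity_check(sample_fn get_sample)
{
	unsigned long sample, prev = 0;
	size_t i, changed;

	for (changed = 0, i = 0; i < SAMPLES; ++i) {
		if (!get_sample(&sample))
			return false;
		/* The first sample has no predecessor to compare against. */
		changed += i && sample != prev;
		prev = sample;
	}
	return changed >= MIN_CHANGE;
}

/* Stand-in sources for demonstration; none of them touch real hardware. */
static bool stuck_source(unsigned long *v)
{
	*v = ~0UL;	/* always all-ones, like the reported RDSEED output */
	return true;
}

static bool counting_source(unsigned long *v)
{
	static unsigned long n;

	*v = n++;	/* every sample differs from the previous one */
	return true;
}

static bool failing_source(unsigned long *v)
{
	(void)v;
	return false;	/* models the instruction reporting no data */
}
```

With these stand-ins, rng_sanity_check(stuck_source) and rng_sanity_check(failing_source) return false, while rng_sanity_check(counting_source) returns true. Whether a check of this shape is meaningful for RDSEED at all is the open question raised next.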

But more importantly, lemme ask around internally whether that is even
a reliable test for detecting whether RDSEED works properly.

Stay tuned...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette