On AMD Cyan Skillfish (Family 0x17 Model 0x47 Stepping 0x0) there is
a situation where RDRAND works perfectly but RDSEED generates only FF's.
Perform a separate sanity check for RDRAND and for RDSEED, as their
behavior may differ.
Signed-off-by: Mikhail Paulyshka <me@mixaill.net>
---
arch/x86/include/asm/archrandom.h | 1 +
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/cpu/rdrand.c | 43 ++++++++++++++++++++++++++++---
3 files changed, 42 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 02bae8e0758b..62ffc8983700 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -57,6 +57,7 @@ static inline size_t __must_check arch_get_random_seed_longs(unsigned long *v, s
#ifndef CONFIG_UML
void x86_init_rdrand(struct cpuinfo_x86 *c);
+void x86_init_rdseed(struct cpuinfo_x86 *c);
#endif
#endif /* ASM_X86_ARCHRANDOM_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 7cce91b19fb2..277781863210 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1883,6 +1883,7 @@ static void identify_cpu(struct cpuinfo_x86 *c)
}
x86_init_rdrand(c);
+ x86_init_rdseed(c);
setup_pku(c);
setup_cet(c);
diff --git a/arch/x86/kernel/cpu/rdrand.c b/arch/x86/kernel/cpu/rdrand.c
index eeac00d20926..e9f7ef5dfe25 100644
--- a/arch/x86/kernel/cpu/rdrand.c
+++ b/arch/x86/kernel/cpu/rdrand.c
@@ -12,18 +12,20 @@
#include <asm/archrandom.h>
#include <asm/sections.h>
+
+enum { SAMPLES = 8, MIN_CHANGE = 5 };
+
/*
* RDRAND has Built-In-Self-Test (BIST) that runs on every invocation.
* Run the instruction a few times as a sanity check. Also make sure
* it's not outputting the same value over and over, which has happened
* as a result of past CPU bugs.
*
- * If it fails, it is simple to disable RDRAND and RDSEED here.
+ * If it fails, it is simple to disable RDRAND here.
*/
void x86_init_rdrand(struct cpuinfo_x86 *c)
{
- enum { SAMPLES = 8, MIN_CHANGE = 5 };
unsigned long sample, prev;
bool failure = false;
size_t i, changed;
@@ -44,7 +46,42 @@ void x86_init_rdrand(struct cpuinfo_x86 *c)
if (failure) {
clear_cpu_cap(c, X86_FEATURE_RDRAND);
- clear_cpu_cap(c, X86_FEATURE_RDSEED);
pr_emerg("RDRAND is not reliable on this platform; disabling.\n");
}
}
+
+
+/*
+ * RDSEED has Built-In-Self-Test (BIST) that runs on every invocation.
+ * Run the instruction a few times as a sanity check. Also make sure
+ * it's not outputting the same value over and over, which has happened
+ * as a result of past CPU bugs.
+ *
+ * If it fails, it is simple to disable RDSEED here.
+ */
+
+void x86_init_rdseed(struct cpuinfo_x86 *c)
+{
+ unsigned long sample, prev;
+ bool failure = false;
+ size_t i, changed;
+
+ if (!cpu_has(c, X86_FEATURE_RDSEED))
+ return;
+
+ for (changed = 0, i = 0; i < SAMPLES; ++i) {
+ if (!rdseed_long(&sample)) {
+ failure = true;
+ break;
+ }
+ changed += i && sample != prev;
+ prev = sample;
+ }
+ if (changed < MIN_CHANGE)
+ failure = true;
+
+ if (failure) {
+ clear_cpu_cap(c, X86_FEATURE_RDSEED);
+ pr_emerg("RDSEED is not reliable on this platform; disabling.\n");
+ }
+}
--
2.48.1
Hello,
kernel test robot noticed "WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap" on:
commit: 1a98daa004bca11b293ae344384b120c1f3560eb ("[PATCH 1/2] x86/rdrand: implement sanity check for RDSEED")
url: https://github.com/intel-lab-lkp/linux/commits/Mikhail-Paulyshka/x86-rdrand-implement-sanity-check-for-RDSEED/20250312-204319
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 6d536cad0d55e71442b6d65500f74eb85544269e
patch link: https://lore.kernel.org/all/20250312123130.8290-2-me@mixaill.net/
patch subject: [PATCH 1/2] x86/rdrand: implement sanity check for RDSEED
in testcase: igt
version: igt-x86_64-534d75199-1_20250316
with following parameters:
group: group-23
config: x86_64-rhel-9.4-func
compiler: gcc-12
test machine: 20 threads 1 sockets (Comet Lake) with 16G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202503211421.fc83271a-lkp@intel.com
[ 0.995938][ T0] ------------[ cut here ]------------
[ 0.995938][ T0] WARNING: CPU: 4 PID: 0 at arch/x86/kernel/cpu/cpuid-deps.c:118 do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.995938][ T0] Modules linked in:
[ 0.995938][ T0] CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Not tainted 6.14.0-rc5-00157-g1a98daa004bc #1
[ 0.995938][ T0] RIP: 0010:do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.995938][ T0] Code: 89 c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 08 84 d2 0f 85 a8 00 00 00 8b 15 d1 f2 01 05 85 d2 0f 84 ee fd ff ff <0f> 0b e9 e7 fd ff ff 48 c7 c7 40 c3 33 86 e8 4e fd ff ff 49 8d bf
All code
========
0: 89 c1 mov %eax,%ecx
2: 83 e0 07 and $0x7,%eax
5: 48 c1 e9 03 shr $0x3,%rcx
9: 83 c0 03 add $0x3,%eax
c: 0f b6 14 11 movzbl (%rcx,%rdx,1),%edx
10: 38 d0 cmp %dl,%al
12: 7c 08 jl 0x1c
14: 84 d2 test %dl,%dl
16: 0f 85 a8 00 00 00 jne 0xc4
1c: 8b 15 d1 f2 01 05 mov 0x501f2d1(%rip),%edx # 0x501f2f3
22: 85 d2 test %edx,%edx
24: 0f 84 ee fd ff ff je 0xfffffffffffffe18
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 e7 fd ff ff jmp 0xfffffffffffffe18
31: 48 c7 c7 40 c3 33 86 mov $0xffffffff8633c340,%rdi
38: e8 4e fd ff ff call 0xfffffffffffffd8b
3d: 49 rex.WB
3e: 8d .byte 0x8d
3f: bf .byte 0xbf
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 e7 fd ff ff jmp 0xfffffffffffffdee
7: 48 c7 c7 40 c3 33 86 mov $0xffffffff8633c340,%rdi
e: e8 4e fd ff ff call 0xfffffffffffffd61
13: 49 rex.WB
14: 8d .byte 0x8d
15: bf .byte 0xbf
[ 0.995938][ T0] RSP: 0000:ffffc90000237d20 EFLAGS: 00010002
[ 0.995938][ T0] RAX: 0000000000000003 RBX: ff656fe452696248 RCX: 1ffffffff0c67895
[ 0.995938][ T0] RDX: 0000000000000001 RSI: 0000000000000132 RDI: ffff8883e2e28060
[ 0.995938][ T0] RBP: ffff8883e2e28060 R08: ffff8883e2e28060 R09: fffffbfff0ef9094
[ 0.995938][ T0] R10: ffffffff877c84a3 R11: 0000000000000001 R12: 0000000000000132
[ 0.995938][ T0] R13: ffffffff873f8440 R14: ffffffff873f83c0 R15: ffff8883e2e28090
[ 0.995938][ T0] FS: 0000000000000000(0000) GS:ffff88845ba72000(0000) knlGS:0000000000000000
[ 0.995938][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.995938][ T0] CR2: 0000000000000000 CR3: 000000045b26c001 CR4: 00000000003706b0
[ 0.995938][ T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.995938][ T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.995938][ T0] Call Trace:
[ 0.995938][ T0] <TASK>
[ 0.995938][ T0] ? __warn (kernel/panic.c:748)
[ 0.995938][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.995938][ T0] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 0.995938][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.995938][ T0] ? handle_bug (arch/x86/kernel/traps.c:337)
[ 0.995938][ T0] ? exc_invalid_op (arch/x86/kernel/traps.c:391 (discriminator 1))
[ 0.995938][ T0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:574)
[ 0.995938][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
[ 0.995938][ T0] ? __pfx_do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:109)
[ 0.995938][ T0] ? init_ia32_feat_ctl (arch/x86/kernel/cpu/feat_ctl.c:189)
[ 0.995938][ T0] x86_init_rdseed (arch/x86/kernel/cpu/rdrand.c:85)
[ 0.995938][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:519 arch/x86/kernel/cpu/common.c:1922)
[ 0.995938][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:2014)
[ 0.995938][ T0] start_secondary (arch/x86/kernel/smpboot.c:199 arch/x86/kernel/smpboot.c:283)
[ 0.995938][ T0] ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:233)
[ 0.995938][ T0] ? startup_64_load_idt (arch/x86/include/asm/desc.h:215 arch/x86/kernel/head64.c:549)
[ 0.995938][ T0] common_startup_64 (arch/x86/kernel/head_64.S:419)
[ 0.995938][ T0] </TASK>
[ 0.995938][ T0] ---[ end trace 0000000000000000 ]---
[ 0.995938][ T0] RDSEED is not reliable on this platform; disabling.
[ 0.995938][ T0] Masked ExtINT on CPU#5
[ 0.995938][ T0] Masked ExtINT on CPU#6
[ 0.995938][ T0] RDSEED is not reliable on this platform; disabling.
[ 0.995938][ T0] Masked ExtINT on CPU#7
[ 0.995938][ T0] RDSEED is not reliable on this platform; disabling.
[ 0.995938][ T0] Masked ExtINT on CPU#8
[ 0.995938][ T0] RDSEED is not reliable on this platform; disabling.
[ 0.995938][ T0] Masked ExtINT on CPU#9
[ 0.995938][ T0] RDSEED is not reliable on this platform; disabling.
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250321/202503211421.fc83271a-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On Wed, Mar 12, 2025 at 03:31:29PM +0300, Mikhail Paulyshka wrote:
> +/*
> + * RDSEED has Built-In-Self-Test (BIST) that runs on every invocation.
> + * Run the instruction a few times as a sanity check. Also make sure
> + * it's not outputting the same value over and over, which has happened
> + * as a result of past CPU bugs.
> + *
> + * If it fails, it is simple to disable RDSEED here.
> + */
> +
> +void x86_init_rdseed(struct cpuinfo_x86 *c)
> +{
> + unsigned long sample, prev;
> + bool failure = false;
> + size_t i, changed;
> +
> + if (!cpu_has(c, X86_FEATURE_RDSEED))
> + return;
> +
> + for (changed = 0, i = 0; i < SAMPLES; ++i) {
> + if (!rdseed_long(&sample)) {
> + failure = true;
> + break;
> + }
> + changed += i && sample != prev;
> + prev = sample;
> + }
> + if (changed < MIN_CHANGE)
> + failure = true;
> +
> + if (failure) {
> + clear_cpu_cap(c, X86_FEATURE_RDSEED);
> + pr_emerg("RDSEED is not reliable on this platform; disabling.\n");
> + }
> +}
This one basically duplicates x86_init_rdrand() and I'm sure you can use
a single function to test both.
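A minimal sketch of what such a shared helper might look like, written as a userspace C model rather than kernel code and not taken from this thread. The names rng_insn_is_sane(), sample_fn, stuck_at_ff(), counting() and always_fails() are hypothetical; in the kernel the sampler argument would presumably be rdrand_long() or rdseed_long(), and the caller would clear the matching feature bit on failure. The loop and the SAMPLES/MIN_CHANGE thresholds are the ones from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SAMPLES = 8, MIN_CHANGE = 5 };

/* Hypothetical sampler type: returns false when the instruction reports failure. */
typedef bool (*sample_fn)(unsigned long *v);

/*
 * Shared sanity check (hypothetical name). Same logic as the two
 * functions in the patch: fail if any invocation fails, or if fewer
 * than MIN_CHANGE of the SAMPLES - 1 consecutive pairs differ.
 */
bool rng_insn_is_sane(sample_fn get)
{
	unsigned long sample, prev = 0;
	size_t i, changed;

	assert(get != NULL);

	for (changed = 0, i = 0; i < SAMPLES; ++i) {
		if (!get(&sample))
			return false;
		/* The first sample has no predecessor, so skip it. */
		changed += i && sample != prev;
		prev = sample;
	}
	return changed >= MIN_CHANGE;
}

/* Stand-in for the reported RDSEED behavior: succeeds, but returns all ones. */
bool stuck_at_ff(unsigned long *v)
{
	*v = ~0UL;
	return true;
}

/* Stand-in for a working source: a counter, so consecutive values differ. */
bool counting(unsigned long *v)
{
	static unsigned long next;

	*v = next++;
	return true;
}

/* Stand-in for an instruction that always reports failure. */
bool always_fails(unsigned long *v)
{
	(void)v;
	return false;
}
```

With these stand-ins, rng_insn_is_sane(stuck_at_ff) and rng_insn_is_sane(always_fails) return false, while rng_insn_is_sane(counting) returns true. Since only SAMPLES - 1 = 7 comparisons are made, the check tolerates at most two repeated values before it reports a failure.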
But more importantly, lemme ask around internally whether that is even
a reliable test to detect RDSEED performs properly or not.
Stay tuned...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette