[PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests

Nikunj A Dadhania posted 1 patch 2 days, 9 hours ago
[PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Nikunj A Dadhania 2 days, 9 hours ago
FRED-enabled SEV-ES and SNP guests fail to boot due to the following
issues in the early boot sequence:

* FRED does not have a #VC exception handler in the dispatch logic

* For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
  console output triggers a #VC which cannot be handled

* Early FRED #VC exceptions should use boot_ghcb until per-CPU GHCBs are
  initialized

Fix these issues to ensure SEV-ES/SNP guests can handle #VC exceptions
correctly during early boot when FRED is enabled.

Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
Cc: stable@vger.kernel.org # 6.9+
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
---

Reason to add stable tag:

With FRED support for SVM here 
https://lore.kernel.org/kvm/20260129063653.3553076-1-shivansh.dhiman@amd.com,
SVM and SEV guests running 6.9 and later kernels will support FRED.
However, *SEV-ES and SNP guests cannot support FRED* and will fail to boot
with the following error:

    [    0.005144] Using GB pages for direct mapping
    [    0.008402] Initialize FRED on CPU0
    qemu-system-x86_64: cpus are not resettable, terminating

Three problems were identified, as detailed in the commit message above, and
are fixed by this patch.

I would like the patch to be backported to the LTS kernels (6.12 and 6.18) to
ensure SEV-ES and SNP guests running these stable kernel versions can boot
with FRED enabled on FRED-enabled hypervisors.

---

 arch/x86/coco/sev/noinstr.c |  6 ++++++
 arch/x86/entry/entry_fred.c |  5 +++++
 arch/x86/kernel/fred.c      | 14 +++++++++++---
 3 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/x86/coco/sev/noinstr.c b/arch/x86/coco/sev/noinstr.c
index 9d94aca4a698..5afd663a1c21 100644
--- a/arch/x86/coco/sev/noinstr.c
+++ b/arch/x86/coco/sev/noinstr.c
@@ -121,6 +121,9 @@ noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
 
 	WARN_ON(!irqs_disabled());
 
+	if (!sev_cfg.ghcbs_initialized)
+		return boot_ghcb;
+
 	data = this_cpu_read(runtime_data);
 	ghcb = &data->ghcb_page;
 
@@ -164,6 +167,9 @@ noinstr void __sev_put_ghcb(struct ghcb_state *state)
 
 	WARN_ON(!irqs_disabled());
 
+	if (!sev_cfg.ghcbs_initialized)
+		return;
+
 	data = this_cpu_read(runtime_data);
 	ghcb = &data->ghcb_page;
 
diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
index a9b72997103d..7a8659f19441 100644
--- a/arch/x86/entry/entry_fred.c
+++ b/arch/x86/entry/entry_fred.c
@@ -208,6 +208,11 @@ static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
 #ifdef CONFIG_X86_CET
 	case X86_TRAP_CP: return exc_control_protection(regs, error_code);
 #endif
+	case X86_TRAP_VC:
+		if (user_mode(regs))
+			return user_exc_vmm_communication(regs, error_code);
+		else
+			return kernel_exc_vmm_communication(regs, error_code);
 	default: return fred_bad_type(regs, error_code);
 	}
 
diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c
index e736b19e18de..8cf4da546a8e 100644
--- a/arch/x86/kernel/fred.c
+++ b/arch/x86/kernel/fred.c
@@ -27,9 +27,6 @@ EXPORT_PER_CPU_SYMBOL(fred_rsp0);
 
 void cpu_init_fred_exceptions(void)
 {
-	/* When FRED is enabled by default, remove this log message */
-	pr_info("Initialize FRED on CPU%d\n", smp_processor_id());
-
 	/*
 	 * If a kernel event is delivered before a CPU goes to user level for
 	 * the first time, its SS is NULL thus NULL is pushed into the SS field
@@ -70,6 +67,17 @@ void cpu_init_fred_exceptions(void)
 	/* Use int $0x80 for 32-bit system calls in FRED mode */
 	setup_clear_cpu_cap(X86_FEATURE_SYSFAST32);
 	setup_clear_cpu_cap(X86_FEATURE_SYSCALL32);
+
+	/*
+	 * For secondary processors, FRED bit in CR4 gets enabled in cr4_init()
+	 * and FRED MSRs are not configured till the end of this function. For
+	 * SEV-ES and SNP guests, any console write before the FRED MSRs are
+	 * setup will cause a #VC and cannot be handled. Move the pr_info to
+	 * the end of this function.
+	 *
+	 * When FRED is enabled by default, remove this log message
+	 */
+	pr_info("Initialized FRED on CPU%d\n", smp_processor_id());
 }
 
 /* Must be called after setup_cpu_entry_areas() */

base-commit: 3c2ca964f75460093a8aad6b314a6cd558e80e66
-- 
2.48.1
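
The boot_ghcb fallback in the noinstr.c hunks can be sketched as a small userspace model. This is only an illustration under stated assumptions: the types, the globals and the model_* function names are hypothetical stand-ins and are not the kernel's definitions. The real __sev_get_ghcb() also deals with nested use of the per-CPU GHCB, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: stand-in types, not the kernel's structures. */
struct ghcb { int id; };
struct ghcb_state { struct ghcb *saved; };

static struct ghcb boot_ghcb_page = { .id = 0 };   /* models the early boot GHCB */
static struct ghcb percpu_ghcb_page = { .id = 1 }; /* models this CPU's runtime GHCB */
static bool ghcbs_initialized;                     /* models sev_cfg.ghcbs_initialized */
static bool percpu_ghcb_active;

/* Hypothetical helpers so callers do not touch the globals directly. */
void model_set_ghcbs_initialized(bool on) { ghcbs_initialized = on; }
bool model_percpu_ghcb_active(void) { return percpu_ghcb_active; }
struct ghcb *model_boot_ghcb(void) { return &boot_ghcb_page; }

struct ghcb *model_get_ghcb(struct ghcb_state *state)
{
	state->saved = NULL;

	/* Early boot: per-CPU GHCBs do not exist yet, so use the boot GHCB. */
	if (!ghcbs_initialized)
		return &boot_ghcb_page;

	/* The model does not implement nesting, so it insists on a free GHCB. */
	assert(!percpu_ghcb_active);
	percpu_ghcb_active = true;
	return &percpu_ghcb_page;
}

void model_put_ghcb(struct ghcb_state *state)
{
	(void)state;

	/* Nothing was claimed on the early path, so nothing is released. */
	if (!ghcbs_initialized)
		return;

	percpu_ghcb_active = false;
}
```

In this model, the early return in the put path mirrors the one in the get path, so a get/put pair taken before initialization never touches per-CPU state.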
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Dave Hansen 1 day, 22 hours ago
On 2/4/26 21:10, Nikunj A Dadhania wrote:
...
> --- a/arch/x86/entry/entry_fred.c
> +++ b/arch/x86/entry/entry_fred.c
> @@ -208,6 +208,11 @@ static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
>  #ifdef CONFIG_X86_CET
>  	case X86_TRAP_CP: return exc_control_protection(regs, error_code);
>  #endif
> +	case X86_TRAP_VC:
> +		if (user_mode(regs))
> +			return user_exc_vmm_communication(regs, error_code);
> +		else
> +			return kernel_exc_vmm_communication(regs, error_code);
>  	default: return fred_bad_type(regs, error_code);
>  	}

Please look at the code in the ~20 lines above this hunk. It has a nice,
consistent form of:

	case X86_TRAP_FOO: return exc_foo_action(...);

Could we keep that going, please?

Second, these functions are defined in arch/x86/coco/sev/vc-handle.c.
That looks suspiciously like CONFIG_AMD_MEM_ENCRYPT code and not
something that will compile everywhere. Also note the other features in
the switch() block. See all the #ifdefs on those?

Have you compiled this?

> diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c
> index e736b19e18de..8cf4da546a8e 100644
> --- a/arch/x86/kernel/fred.c
> +++ b/arch/x86/kernel/fred.c
> @@ -27,9 +27,6 @@ EXPORT_PER_CPU_SYMBOL(fred_rsp0);
>  
>  void cpu_init_fred_exceptions(void)
>  {
> -	/* When FRED is enabled by default, remove this log message */
> -	pr_info("Initialize FRED on CPU%d\n", smp_processor_id());
> -
>  	/*
>  	 * If a kernel event is delivered before a CPU goes to user level for
>  	 * the first time, its SS is NULL thus NULL is pushed into the SS field
> @@ -70,6 +67,17 @@ void cpu_init_fred_exceptions(void)
>  	/* Use int $0x80 for 32-bit system calls in FRED mode */
>  	setup_clear_cpu_cap(X86_FEATURE_SYSFAST32);
>  	setup_clear_cpu_cap(X86_FEATURE_SYSCALL32);
> +
> +	/*
> +	 * For secondary processors, FRED bit in CR4 gets enabled in cr4_init()
> +	 * and FRED MSRs are not configured till the end of this function. For
> +	 * SEV-ES and SNP guests, any console write before the FRED MSRs are
> +	 * setup will cause a #VC and cannot be handled. Move the pr_info to
> +	 * the end of this function.
> +	 *
> +	 * When FRED is enabled by default, remove this log message
> +	 */
> +	pr_info("Initialized FRED on CPU%d\n", smp_processor_id());
>  }

This seems really gross. Now there's a window where printk() doesn't
work. To fix it, we start moving printk()'s?

Please, no.

Shouldn't we flip the FRED CR4 bit _last_, once all the MSRs are set up?
Why is it backwards in the first place? Why can't it be fixed?
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Nikunj A. Dadhania 1 day, 2 hours ago

On 2/5/2026 9:40 PM, Dave Hansen wrote:
> On 2/4/26 21:10, Nikunj A Dadhania wrote:
> ...
>> --- a/arch/x86/entry/entry_fred.c
>> +++ b/arch/x86/entry/entry_fred.c
>> @@ -208,6 +208,11 @@ static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
>>  #ifdef CONFIG_X86_CET
>>  	case X86_TRAP_CP: return exc_control_protection(regs, error_code);
>>  #endif
>> +	case X86_TRAP_VC:
>> +		if (user_mode(regs))
>> +			return user_exc_vmm_communication(regs, error_code);
>> +		else
>> +			return kernel_exc_vmm_communication(regs, error_code);
>>  	default: return fred_bad_type(regs, error_code);
>>  	}
> 
> Please look at the code in the ~20 lines above this hunk. It has a nice,
> consistent form of:
> 
> 	case X86_TRAP_FOO: return exc_foo_action(...);
> 
> Could we keep that going, please?

There are a couple of options; I will test them and get back.

> Second, these functions are defined in arch/x86/coco/sev/vc-handle.c.
> That looks suspiciously like CONFIG_AMD_MEM_ENCRYPT code and not
> something that will compile everywhere. Also note the other features in
> the switch() block. See all the #ifdefs on those?
> 
> Have you compiled this?

Compiled and tested. I missed testing without CONFIG_AMD_MEM_ENCRYPT; I will add that.

Regards,
Nikunj
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Dave Hansen 1 day, 21 hours ago
On 2/5/26 08:10, Dave Hansen wrote:
> Shouldn't we flip the FRED CR4 bit _last_, once all the MSRs are set up?
> Why is it backwards in the first place? Why can't it be fixed?

Ahhh, it was done by CR4 pinning. It's the first thing in C code for
booting secondaries:

static void notrace __noendbr start_secondary(void *unused)
{
        cr4_init();

Since FRED is set in 'cr4_pinned_mask', cr4_init() sets the FRED bit far
before the FRED MSRs are ready. Anyone else doing native_write_cr4()
will do the same thing. That's obviously not what was intended from the
pinning code or the FRED init code.

Shouldn't we fix this properly rather than moving printk()'s around?

One idea is just to turn off all the CR-pinning logic while bringing
CPUs up. That way, nothing before:

	set_cpu_online(smp_processor_id(), true);

can get tripped up by CR pinning. I've attached a completely untested
patch to do that.

The other thing would be to make pinning actually per-cpu:
'cr4_pinned_bits' could be per-cpu and we'd just keep it empty until the
CPU is actually booted and everything is fully set up.

Either way, this is looking like it'll be a bit more than one patch to
do properly.
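
The ordering problem described above can be sketched as a small userspace model. It is an illustration under stated assumptions, not kernel code: struct model_cpu, the model_* functions and the MODEL_CR4_FRED bit position are hypothetical names chosen for the example. It contrasts a cr4_init()-style write that always applies the pinned bits with a variant that applies pinning only once the CPU is marked online.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CR4_FRED (1ULL << 32) /* bit position assumed for illustration */

struct model_cpu {
	uint64_t cr4;
	bool fred_msrs_ready;
	bool online;
};

static const uint64_t pinned_mask = MODEL_CR4_FRED;
static uint64_t pinned_bits = MODEL_CR4_FRED; /* as left behind by the boot CPU */

/* Behaviour described in the thread: pinned bits are forced on immediately. */
void model_cr4_init_global(struct model_cpu *cpu)
{
	cpu->cr4 = (cpu->cr4 & ~pinned_mask) | pinned_bits;
}

/* Alternative: pinning only takes effect after the CPU is online. */
void model_cr4_write(struct model_cpu *cpu, uint64_t val)
{
	if (cpu->online)
		val = (val & ~pinned_mask) | pinned_bits;
	cpu->cr4 = val;
}

void model_fred_setup_msrs(struct model_cpu *cpu)
{
	cpu->fred_msrs_ready = true;
}

/* Flip the FRED bit last, once the MSRs are set up, then go online. */
void model_enable_fred_last(struct model_cpu *cpu)
{
	assert(cpu->fred_msrs_ready);
	model_cr4_write(cpu, cpu->cr4 | MODEL_CR4_FRED);
	cpu->online = true;
}

/* True while an exception could not be delivered through FRED. */
bool model_in_unsafe_window(const struct model_cpu *cpu)
{
	return (cpu->cr4 & MODEL_CR4_FRED) && !cpu->fred_msrs_ready;
}
```

With the global variant, a secondary CPU enters the unsafe window on its very first CR4 write. With the online-gated variant, the window never opens as long as FRED is enabled only after its MSRs are programmed.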
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Tom Lendacky 1 day, 21 hours ago
On 2/5/26 11:20, Dave Hansen wrote:
> On 2/5/26 08:10, Dave Hansen wrote:
>> Shouldn't we flip the FRED CR4 bit _last_, once all the MSRs are set up?
>> Why is it backwards in the first place? Why can't it be fixed?
> 
> Ahhh, it was done by CR4 pinning. It's the first thing in C code for
> booting secondaries:
> 
> static void notrace __noendbr start_secondary(void *unused)
> {
>         cr4_init();
> 
> Since FRED is set in 'cr4_pinned_mask', cr4_init() sets the FRED bit far
> before the FRED MSRs are ready. Anyone else doing native_write_cr4()
> will do the same thing. That's obviously not what was intended from the
> pinning code or the FRED init code.
> 
> Shouldn't we fix this properly rather than moving printk()'s around?

I believe that is what this part of the thread decided on:

https://lore.kernel.org/kvm/02df7890-83c2-4047-8c88-46fbc6e0a892@intel.com/T/#m3e44c2c53aca3bcd872de4ce1e50a14500e62e4e

Thanks,
Tom

> 
> One idea is just to turn off all the CR-pinning logic while bringing
> CPUs up. That way, nothing before:
> 
> 	set_cpu_online(smp_processor_id(), true);
> 
> can get tripped up by CR pinning. I've attached a completely untested
> patch to do that.
> 
> The other thing would be to make pinning actually per-cpu:
> 'cr4_pinned_bits' could be per-cpu and we'd just keep it empty until the
> CPU is actually booted and everything is fully set up.
> 
> Either way, this is looking like it'll be a bit more than one patch to
> do properly.
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Nikunj A. Dadhania 1 day, 2 hours ago

On 2/5/2026 11:09 PM, Tom Lendacky wrote:
> On 2/5/26 11:20, Dave Hansen wrote:
>> On 2/5/26 08:10, Dave Hansen wrote:
>>> Shouldn't we flip the FRED CR4 bit _last_, once all the MSRs are set up?
>>> Why is it backwards in the first place? Why can't it be fixed?
>>
>> Ahhh, it was done by CR4 pinning. It's the first thing in C code for
>> booting secondaries:
>>
>> static void notrace __noendbr start_secondary(void *unused)
>> {
>>         cr4_init();
>>
>> Since FRED is set in 'cr4_pinned_mask', cr4_init() sets the FRED bit far
>> before the FRED MSRs are ready. Anyone else doing native_write_cr4()
>> will do the same thing. That's obviously not what was intended from the
>> pinning code or the FRED init code.
>>
>> Shouldn't we fix this properly rather than moving printk()'s around?
> 
> I believe that is what this part of the thread decided on:
> 
> https://lore.kernel.org/kvm/02df7890-83c2-4047-8c88-46fbc6e0a892@intel.com/T/#m3e44c2c53aca3bcd872de4ce1e50a14500e62e4e
> 
> Thanks,
> Tom
> 
>>
>> One idea is just to turn off all the CR-pinning logic while bringing
>> CPUs up. That way, nothing before:
>>
>> 	set_cpu_online(smp_processor_id(), true);
>>
>> can get tripped up by CR pinning. I've attached a completely untested
>> patch to do that.

Yes, this works as well. And Xin Li's patch also resolves the issue by
moving the cr4_init() later after initializing FRED MSRs.

Regards,
Nikunj
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by H. Peter Anvin 1 day, 21 hours ago
On February 5, 2026 9:20:20 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>On 2/5/26 08:10, Dave Hansen wrote:
>> Shouldn't we flip the FRED CR4 bit _last_, once all the MSRs are set up?
>> Why is it backwards in the first place? Why can't it be fixed?
>
>Ahhh, it was done by CR4 pinning. It's the first thing in C code for
>booting secondaries:
>
>static void notrace __noendbr start_secondary(void *unused)
>{
>        cr4_init();
>
>Since FRED is set in 'cr4_pinned_mask', cr4_init() sets the FRED bit far
>before the FRED MSRs are ready. Anyone else doing native_write_cr4()
>will do the same thing. That's obviously not what was intended from the
>pinning code or the FRED init code.
>
>Shouldn't we fix this properly rather than moving printk()'s around?
>
>One idea is just to turn off all the CR-pinning logic while bringing
>CPUs up. That way, nothing before:
>
>	set_cpu_online(smp_processor_id(), true);
>
>can get tripped up by CR pinning. I've attached a completely untested
>patch to do that.
>
>The other thing would be to make pinning actually per-cpu:
>'cr4_pinned_bits' could be per-cpu and we'd just keep it empty until the
>CPU is actually booted and everything is fully set up.
>
>Either way, this is looking like it'll be a bit more than one patch to
>do properly.
We could initialize the FRED MSRs much earlier, like we do during S3 resume.
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by kernel test robot 2 days, 2 hours ago
Hi Nikunj,

kernel test robot noticed the following build errors:

[auto build test ERROR on 3c2ca964f75460093a8aad6b314a6cd558e80e66]

url:    https://github.com/intel-lab-lkp/linux/commits/Nikunj-A-Dadhania/x86-fred-Fix-early-boot-failures-on-SEV-ES-SNP-guests/20260205-131359
base:   3c2ca964f75460093a8aad6b314a6cd558e80e66
patch link:    https://lore.kernel.org/r/20260205051030.1225975-1-nikunj%40amd.com
patch subject: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20260205/202602052054.J3CEkmKB-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260205/202602052054.J3CEkmKB-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602052054.J3CEkmKB-lkp@intel.com/

All errors (new ones prefixed by >>):

>> arch/x86/entry/entry_fred.c:213:11: error: call to undeclared function 'user_exc_vmm_communication'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     213 |                         return user_exc_vmm_communication(regs, error_code);
         |                                ^
   arch/x86/entry/entry_fred.c:213:4: warning: void function 'fred_hwexc' should not return a value [-Wreturn-mismatch]
     213 |                         return user_exc_vmm_communication(regs, error_code);
         |                         ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> arch/x86/entry/entry_fred.c:215:11: error: call to undeclared function 'kernel_exc_vmm_communication'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     215 |                         return kernel_exc_vmm_communication(regs, error_code);
         |                                ^
   arch/x86/entry/entry_fred.c:215:4: warning: void function 'fred_hwexc' should not return a value [-Wreturn-mismatch]
     215 |                         return kernel_exc_vmm_communication(regs, error_code);
         |                         ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   2 warnings and 2 errors generated.


vim +/user_exc_vmm_communication +213 arch/x86/entry/entry_fred.c

   180	
   181	static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
   182	{
   183		/* Optimize for #PF. That's the only exception which matters performance wise */
   184		if (likely(regs->fred_ss.vector == X86_TRAP_PF))
   185			return exc_page_fault(regs, error_code);
   186	
   187		switch (regs->fred_ss.vector) {
   188		case X86_TRAP_DE: return exc_divide_error(regs);
   189		case X86_TRAP_DB: return fred_exc_debug(regs);
   190		case X86_TRAP_BR: return exc_bounds(regs);
   191		case X86_TRAP_UD: return exc_invalid_op(regs);
   192		case X86_TRAP_NM: return exc_device_not_available(regs);
   193		case X86_TRAP_DF: return exc_double_fault(regs, error_code);
   194		case X86_TRAP_TS: return exc_invalid_tss(regs, error_code);
   195		case X86_TRAP_NP: return exc_segment_not_present(regs, error_code);
   196		case X86_TRAP_SS: return exc_stack_segment(regs, error_code);
   197		case X86_TRAP_GP: return exc_general_protection(regs, error_code);
   198		case X86_TRAP_MF: return exc_coprocessor_error(regs);
   199		case X86_TRAP_AC: return exc_alignment_check(regs, error_code);
   200		case X86_TRAP_XF: return exc_simd_coprocessor_error(regs);
   201	
   202	#ifdef CONFIG_X86_MCE
   203		case X86_TRAP_MC: return fred_exc_machine_check(regs);
   204	#endif
   205	#ifdef CONFIG_INTEL_TDX_GUEST
   206		case X86_TRAP_VE: return exc_virtualization_exception(regs);
   207	#endif
   208	#ifdef CONFIG_X86_CET
   209		case X86_TRAP_CP: return exc_control_protection(regs, error_code);
   210	#endif
   211		case X86_TRAP_VC:
   212			if (user_mode(regs))
 > 213				return user_exc_vmm_communication(regs, error_code);
   214			else
 > 215				return kernel_exc_vmm_communication(regs, error_code);
   216		default: return fred_bad_type(regs, error_code);
   217		}
   218	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by kernel test robot 2 days, 2 hours ago
Hi Nikunj,

kernel test robot noticed the following build errors:

[auto build test ERROR on 3c2ca964f75460093a8aad6b314a6cd558e80e66]

url:    https://github.com/intel-lab-lkp/linux/commits/Nikunj-A-Dadhania/x86-fred-Fix-early-boot-failures-on-SEV-ES-SNP-guests/20260205-131359
base:   3c2ca964f75460093a8aad6b314a6cd558e80e66
patch link:    https://lore.kernel.org/r/20260205051030.1225975-1-nikunj%40amd.com
patch subject: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
config: x86_64-randconfig-161-20260205 (https://download.01.org/0day-ci/archive/20260205/202602052058.qsluYkXo-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
smatch version: v0.5.0-8994-gd50c5a4c
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260205/202602052058.qsluYkXo-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602052058.qsluYkXo-lkp@intel.com/

All errors (new ones prefixed by >>):

   arch/x86/entry/entry_fred.c: In function 'fred_hwexc':
>> arch/x86/entry/entry_fred.c:213:32: error: implicit declaration of function 'user_exc_vmm_communication' [-Wimplicit-function-declaration]
     213 |                         return user_exc_vmm_communication(regs, error_code);
         |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~
>> arch/x86/entry/entry_fred.c:213:32: error: 'return' with a value, in function returning void [-Wreturn-mismatch]
     213 |                         return user_exc_vmm_communication(regs, error_code);
         |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/x86/entry/entry_fred.c:181:21: note: declared here
     181 | static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
         |                     ^~~~~~~~~~
>> arch/x86/entry/entry_fred.c:215:32: error: implicit declaration of function 'kernel_exc_vmm_communication' [-Wimplicit-function-declaration]
     215 |                         return kernel_exc_vmm_communication(regs, error_code);
         |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/x86/entry/entry_fred.c:215:32: error: 'return' with a value, in function returning void [-Wreturn-mismatch]
     215 |                         return kernel_exc_vmm_communication(regs, error_code);
         |                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/x86/entry/entry_fred.c:181:21: note: declared here
     181 | static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
         |                     ^~~~~~~~~~


vim +/user_exc_vmm_communication +213 arch/x86/entry/entry_fred.c

   180	
   181	static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
   182	{
   183		/* Optimize for #PF. That's the only exception which matters performance wise */
   184		if (likely(regs->fred_ss.vector == X86_TRAP_PF))
   185			return exc_page_fault(regs, error_code);
   186	
   187		switch (regs->fred_ss.vector) {
   188		case X86_TRAP_DE: return exc_divide_error(regs);
   189		case X86_TRAP_DB: return fred_exc_debug(regs);
   190		case X86_TRAP_BR: return exc_bounds(regs);
   191		case X86_TRAP_UD: return exc_invalid_op(regs);
   192		case X86_TRAP_NM: return exc_device_not_available(regs);
   193		case X86_TRAP_DF: return exc_double_fault(regs, error_code);
   194		case X86_TRAP_TS: return exc_invalid_tss(regs, error_code);
   195		case X86_TRAP_NP: return exc_segment_not_present(regs, error_code);
   196		case X86_TRAP_SS: return exc_stack_segment(regs, error_code);
   197		case X86_TRAP_GP: return exc_general_protection(regs, error_code);
   198		case X86_TRAP_MF: return exc_coprocessor_error(regs);
   199		case X86_TRAP_AC: return exc_alignment_check(regs, error_code);
   200		case X86_TRAP_XF: return exc_simd_coprocessor_error(regs);
   201	
   202	#ifdef CONFIG_X86_MCE
   203		case X86_TRAP_MC: return fred_exc_machine_check(regs);
   204	#endif
   205	#ifdef CONFIG_INTEL_TDX_GUEST
   206		case X86_TRAP_VE: return exc_virtualization_exception(regs);
   207	#endif
   208	#ifdef CONFIG_X86_CET
   209		case X86_TRAP_CP: return exc_control_protection(regs, error_code);
   210	#endif
   211		case X86_TRAP_VC:
   212			if (user_mode(regs))
 > 213				return user_exc_vmm_communication(regs, error_code);
   214			else
 > 215				return kernel_exc_vmm_communication(regs, error_code);
   216		default: return fred_bad_type(regs, error_code);
   217		}
   218	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by kernel test robot 2 days, 4 hours ago
Hi Nikunj,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 3c2ca964f75460093a8aad6b314a6cd558e80e66]

url:    https://github.com/intel-lab-lkp/linux/commits/Nikunj-A-Dadhania/x86-fred-Fix-early-boot-failures-on-SEV-ES-SNP-guests/20260205-131359
base:   3c2ca964f75460093a8aad6b314a6cd558e80e66
patch link:    https://lore.kernel.org/r/20260205051030.1225975-1-nikunj%40amd.com
patch subject: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20260205/202602051859.vGTf24Nk-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260205/202602051859.vGTf24Nk-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602051859.vGTf24Nk-lkp@intel.com/

All warnings (new ones prefixed by >>):

   arch/x86/entry/entry_fred.c:213:11: error: call to undeclared function 'user_exc_vmm_communication'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     213 |                         return user_exc_vmm_communication(regs, error_code);
         |                                ^
>> arch/x86/entry/entry_fred.c:213:4: warning: void function 'fred_hwexc' should not return a value [-Wreturn-mismatch]
     213 |                         return user_exc_vmm_communication(regs, error_code);
         |                         ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/x86/entry/entry_fred.c:215:11: error: call to undeclared function 'kernel_exc_vmm_communication'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     215 |                         return kernel_exc_vmm_communication(regs, error_code);
         |                                ^
   arch/x86/entry/entry_fred.c:215:4: warning: void function 'fred_hwexc' should not return a value [-Wreturn-mismatch]
     215 |                         return kernel_exc_vmm_communication(regs, error_code);
         |                         ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   2 warnings and 2 errors generated.


vim +/fred_hwexc +213 arch/x86/entry/entry_fred.c

   180	
   181	static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
   182	{
   183		/* Optimize for #PF. That's the only exception which matters performance wise */
   184		if (likely(regs->fred_ss.vector == X86_TRAP_PF))
   185			return exc_page_fault(regs, error_code);
   186	
   187		switch (regs->fred_ss.vector) {
   188		case X86_TRAP_DE: return exc_divide_error(regs);
   189		case X86_TRAP_DB: return fred_exc_debug(regs);
   190		case X86_TRAP_BR: return exc_bounds(regs);
   191		case X86_TRAP_UD: return exc_invalid_op(regs);
   192		case X86_TRAP_NM: return exc_device_not_available(regs);
   193		case X86_TRAP_DF: return exc_double_fault(regs, error_code);
   194		case X86_TRAP_TS: return exc_invalid_tss(regs, error_code);
   195		case X86_TRAP_NP: return exc_segment_not_present(regs, error_code);
   196		case X86_TRAP_SS: return exc_stack_segment(regs, error_code);
   197		case X86_TRAP_GP: return exc_general_protection(regs, error_code);
   198		case X86_TRAP_MF: return exc_coprocessor_error(regs);
   199		case X86_TRAP_AC: return exc_alignment_check(regs, error_code);
   200		case X86_TRAP_XF: return exc_simd_coprocessor_error(regs);
   201	
   202	#ifdef CONFIG_X86_MCE
   203		case X86_TRAP_MC: return fred_exc_machine_check(regs);
   204	#endif
   205	#ifdef CONFIG_INTEL_TDX_GUEST
   206		case X86_TRAP_VE: return exc_virtualization_exception(regs);
   207	#endif
   208	#ifdef CONFIG_X86_CET
   209		case X86_TRAP_CP: return exc_control_protection(regs, error_code);
   210	#endif
   211		case X86_TRAP_VC:
   212			if (user_mode(regs))
 > 213				return user_exc_vmm_communication(regs, error_code);
   214			else
   215				return kernel_exc_vmm_communication(regs, error_code);
   216		default: return fred_bad_type(regs, error_code);
   217		}
   218	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Nikunj A. Dadhania 1 day, 11 hours ago
On 2/5/2026 4:11 PM, kernel test robot wrote:
> Hi Nikunj,
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on 3c2ca964f75460093a8aad6b314a6cd558e80e66]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Nikunj-A-Dadhania/x86-fred-Fix-early-boot-failures-on-SEV-ES-SNP-guests/20260205-131359
> base:   3c2ca964f75460093a8aad6b314a6cd558e80e66
> patch link:    https://lore.kernel.org/r/20260205051030.1225975-1-nikunj%40amd.com
> patch subject: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
> config: x86_64-kexec (https://download.01.org/0day-ci/archive/20260205/202602051859.vGTf24Nk-lkp@intel.com/config)
> compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260205/202602051859.vGTf24Nk-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202602051859.vGTf24Nk-lkp@intel.com/
> 
> All warnings (new ones prefixed by >>):
> 
>    arch/x86/entry/entry_fred.c:213:11: error: call to undeclared function 'user_exc_vmm_communication'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
>      213 |                         return user_exc_vmm_communication(regs, error_code);
>          |                                ^
>>> arch/x86/entry/entry_fred.c:213:4: warning: void function 'fred_hwexc' should not return a value [-Wreturn-mismatch]
>      213 |                         return user_exc_vmm_communication(regs, error_code);
>          |                         ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    arch/x86/entry/entry_fred.c:215:11: error: call to undeclared function 'kernel_exc_vmm_communication'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
>      215 |                         return kernel_exc_vmm_communication(regs, error_code);
>          |                                ^
>    arch/x86/entry/entry_fred.c:215:4: warning: void function 'fred_hwexc' should not return a value [-Wreturn-mismatch]
>      215 |                         return kernel_exc_vmm_communication(regs, error_code);
>          |                         ^      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    2 warnings and 2 errors generated.
> 
> 

Thanks for the report. The patch below should solve the problem:

diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
index 7a8659f19441..9946b9e96692 100644
--- a/arch/x86/entry/entry_fred.c
+++ b/arch/x86/entry/entry_fred.c
@@ -208,11 +208,15 @@ static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code)
 #ifdef CONFIG_X86_CET
 	case X86_TRAP_CP: return exc_control_protection(regs, error_code);
 #endif
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
 	case X86_TRAP_VC:
 		if (user_mode(regs))
 			return user_exc_vmm_communication(regs, error_code);
 		else
 			return kernel_exc_vmm_communication(regs, error_code);
+#endif
+
 	default: return fred_bad_type(regs, error_code);
 	}
 


Regards
Nikunj
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Xin Li 1 day, 5 hours ago

> On Feb 5, 2026, at 7:31 PM, Nikunj A. Dadhania <nikunj@amd.com> wrote:
> 
> 	if (user_mode(regs))
> 		return user_exc_vmm_communication(regs, error_code);
> 	else
> 		return kernel_exc_vmm_communication(regs, error_code);

Please rewrite this piece of code, like how X86_TRAP_DB is handled today.

Should kernel #VC be handled on a higher FRED stack level (RSP)? You can check the FRED #DB settings.
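
[Editor's sketch, not code from this thread.] As far as I can tell from
current mainline, the FRED dispatch switch routes X86_TRAP_DB to a single
wrapper, and that wrapper picks the user or kernel handler via
user_mode(regs). The userspace model below mimics that shape for #VC so the
switch keeps a one-line case. Everything here is a hypothetical stand-in:
struct fake_regs, fake_user_mode(), fred_exc_vmm_communication(),
fake_fred_hwexc() and the mock handlers are illustrative names, not kernel
symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pt_regs: only the CS selector is modeled. */
struct fake_regs {
	unsigned long cs;
};

enum vc_path { VC_PATH_NONE, VC_PATH_USER, VC_PATH_KERNEL };

/* Records which mock handler ran so the dispatch can be checked. */
static enum vc_path last_vc_path = VC_PATH_NONE;

/* The low two bits of CS hold the privilege level; 3 means user mode. */
static bool fake_user_mode(const struct fake_regs *regs)
{
	return (regs->cs & 3) == 3;
}

/* Mock of the user-mode #VC handler (illustrative only). */
static void mock_user_vc(struct fake_regs *regs, unsigned long error_code)
{
	(void)regs;
	(void)error_code;
	last_vc_path = VC_PATH_USER;
}

/* Mock of the kernel-mode #VC handler (illustrative only). */
static void mock_kernel_vc(struct fake_regs *regs, unsigned long error_code)
{
	(void)regs;
	(void)error_code;
	last_vc_path = VC_PATH_KERNEL;
}

/* Hypothetical wrapper: the user/kernel split lives here, not in the switch. */
static void fred_exc_vmm_communication(struct fake_regs *regs,
				       unsigned long error_code)
{
	assert(regs);
	if (fake_user_mode(regs))
		mock_user_vc(regs, error_code);
	else
		mock_kernel_vc(regs, error_code);
}

#define FAKE_TRAP_VC 29	/* #VC is vector 29 */

/* Returns true if the vector was dispatched, false for an unknown vector. */
static bool fake_fred_hwexc(struct fake_regs *regs, unsigned int vector,
			    unsigned long error_code)
{
	switch (vector) {
	case FAKE_TRAP_VC:
		fred_exc_vmm_communication(regs, error_code);
		return true;
	default:
		return false;	/* the real code calls fred_bad_type() here */
	}
}
```

Calling fake_fred_hwexc() with cs = 0x33 (a user-mode selector) takes the
user path, and with cs = 0x10 (a kernel-mode selector) the kernel path.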
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Xin Li 2 days, 7 hours ago

> On Feb 4, 2026, at 9:10 PM, Nikunj A Dadhania <nikunj@amd.com> wrote:
> 
> FRED enabled SEV-ES and SNP guests fail to boot due to the following
> issues in the early boot sequence:
> 
> * FRED does not have a #VC exception handler in the dispatch logic


This should be a separate patch.


> 
> * For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
>  console output triggers a #VC which cannot be handled

Yes, this is a problem.  I once looked into it for TDX, and had the following patch:

Can you please check if it works for you (#VC handler is set in the bringup IDT on AMD)?


    x86/smp: Set up exception handling before cr4_init()
    
    The current AP boot sequence initializes CR4 before setting up
    exception handling.  With FRED enabled, however, CR4.FRED is set
    prior to initializing the FRED configuration MSRs, introducing a
    brief window where a triple fault could occur.  This isn't
    considered a problem, as the early boot code is carefully designed
    to avoid triggering exceptions.  Moreover, if an exception does
    occur at this stage, it's preferable for the CPU to triple fault
    rather than risk a potential exploit.
    
    However, under TDX, printk() triggers a #VE, so any logging during
    this small window results in a triple fault.
    
    Swap the order of cr4_init() and cpu_init_exception_handling(),
    since cr4_init() only involves reading from and writing to CR4,
    and setting up exception handling does not depend on any specific
    CR4 bits being set (Arguably CR4.PAE, CR4.PSE and CR4.PGE are
    related but they are already set before start_secondary() anyway).
    
    Notably, this triple fault can still occur before FRED is enabled,
    while the bringup IDT is in use, since it lacks a #VE handler.
    
    BTW, on 32-bit systems, loading CR3 with swapper_pg_dir is moved
    ahead of cr4_init(), which appears to be harmless.
    
    Signed-off-by: Xin Li (Intel) <xin@zytor.com>

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index eb289abece23..24497258c16b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -231,13 +231,6 @@ static void ap_calibrate_delay(void)
  */
 static void notrace __noendbr start_secondary(void *unused)
 {
-	/*
-	 * Don't put *anything* except direct CPU state initialization
-	 * before cpu_init(), SMP booting is too fragile that we want to
-	 * limit the things done here to the most necessary things.
-	 */
-	cr4_init();
-
 	/*
 	 * 32-bit specific. 64-bit reaches this code with the correct page
 	 * table established. Yet another historical divergence.
@@ -248,8 +241,37 @@ static void notrace __noendbr start_secondary(void *unused)
 		__flush_tlb_all();
 	}
 
+	/*
+	 * AP startup assembly code has setup the following before calling
+	 * start_secondary() on 64-bit:
+	 *
+	 * 1) CS set to __KERNEL_CS.
+	 * 2) CR3 switched to the init_top_pgt.
+	 * 3) CR4.PAE, CR4.PSE and CR4.PGE are set.
+	 * 4) GDT set to per-CPU gdt_page.
+	 * 5) ALL data segments set to the NULL descriptor.
+	 * 6) MSR_GS_BASE set to per-CPU offset.
+	 * 7) IDT set to bringup IDT.
+	 * 8) CR0 set to CR0_STATE.
+	 *
+	 * So it's ready to setup exception handling.
+	 */
 	cpu_init_exception_handling(false);
 
+	/*
+	 * Ensure bits set in cr4_pinned_bits are set in CR4.
+	 *
+	 * cr4_pinned_bits is a subset of cr4_pinned_mask, which includes
+	 * the following bits:
+	 *         X86_CR4_SMEP
+	 *         X86_CR4_SMAP
+	 *         X86_CR4_UMIP
+	 *         X86_CR4_FSGSBASE
+	 *         X86_CR4_CET
+	 *         X86_CR4_FRED
+	 */
+	cr4_init();
+
 	/*
 	 * Load the microcode before reaching the AP alive synchronization
 	 * point below so it is not part of the full per CPU serialized
@@ -275,6 +297,11 @@ static void notrace __noendbr start_secondary(void *unused)
 	 */
 	cpuhp_ap_sync_alive();
 
+	/*
+	 * Don't put *anything* except direct CPU state initialization
+	 * before cpu_init(), SMP booting is too fragile that we want to
+	 * limit the things done here to the most necessary things.
+	 */
 	cpu_init();
 	fpu__init_cpu();
 	rcutree_report_cpu_starting(raw_smp_processor_id());
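
[Editor's sketch, not part of the patch.] The ordering window described in
the commit message above can be modeled in userspace. The model assumes
that events are delivered through FRED once CR4.FRED is set and through the
bringup IDT before that, and that FRED delivery only works once the FRED
MSRs are programmed. struct ap_state and every function name below are
hypothetical, and the model is deliberately simplified: the real
cpu_init_exception_handling() does more than program the FRED MSRs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the AP state relevant to event delivery. */
struct ap_state {
	bool cr4_fred;			/* CR4.FRED set: delivery goes through FRED */
	bool fred_msrs_ready;		/* FRED configuration MSRs programmed */
	bool bringup_idt_has_handler;	/* bringup IDT handles the event (#VC: yes, #VE: no) */
};

/* Can an exception raised right now (e.g. by a console write) be handled? */
static bool event_deliverable(const struct ap_state *s)
{
	assert(s);
	if (s->cr4_fred)
		return s->fred_msrs_ready;
	return s->bringup_idt_has_handler;
}

/* Model of cr4_init(): pinned bits, including CR4.FRED, get set. */
static void model_cr4_init(struct ap_state *s)
{
	s->cr4_fred = true;
}

/* Simplified model of cpu_init_exception_handling(): FRED MSRs get programmed. */
static void model_cpu_init_exception_handling(struct ap_state *s)
{
	s->fred_msrs_ready = true;
}

/* Old order: cr4_init() first. True only if an event is survivable at every step. */
static bool boot_old_order(struct ap_state *s)
{
	bool ok = event_deliverable(s);

	model_cr4_init(s);
	ok = ok && event_deliverable(s);	/* window: FRED on, MSRs not programmed */
	model_cpu_init_exception_handling(s);
	return ok && event_deliverable(s);
}

/* New order: exception handling first, then cr4_init(). */
static bool boot_new_order(struct ap_state *s)
{
	bool ok = event_deliverable(s);

	model_cpu_init_exception_handling(s);
	ok = ok && event_deliverable(s);	/* still on the bringup IDT */
	model_cr4_init(s);
	return ok && event_deliverable(s);
}
```

With bringup_idt_has_handler set (the #VC case on AMD), the old order fails
inside the window and the new order succeeds. With it cleared (the #VE case
on TDX), even the new order fails at the first step, which matches the
remark in the commit message that the triple fault can still occur while
the bringup IDT is in use.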
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Nikunj A. Dadhania 2 days, 6 hours ago

On 2/5/2026 12:41 PM, Xin Li wrote:
>> On Feb 4, 2026, at 9:10 PM, Nikunj A Dadhania <nikunj@amd.com> wrote:

>>
>> * For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
>>  console output triggers a #VC which cannot be handled
> 
> Yes, this is a problem.  I once looked into it for TDX, and had the following patch:
> 
> Can you please check if it works for you (#VC handler is set in the bringup IDT on AMD)?

Yes, this works as well. With your change that moves cr4_init(), I no longer
need my arch/x86/kernel/fred.c modification (moving pr_info() to avoid the #VC).
SEV-ES / SEV-SNP guests boot successfully with FRED enabled.

Are you planning to post this for inclusion?

Regards
Nikunj
> 
> 
>     x86/smp: Set up exception handling before cr4_init()
>     
>     The current AP boot sequence initializes CR4 before setting up
>     exception handling.  With FRED enabled, however, CR4.FRED is set
>     prior to initializing the FRED configuration MSRs, introducing a
>     brief window where a triple fault could occur.  This isn't
>     considered a problem, as the early boot code is carefully designed
>     to avoid triggering exceptions.  Moreover, if an exception does
>     occur at this stage, it's preferable for the CPU to triple fault
>     rather than risk a potential exploit.
>     
>     However, under TDX, printk() triggers a #VE, so any logging during
>     this small window results in a triple fault.
>     
>     Swap the order of cr4_init() and cpu_init_exception_handling(),
>     since cr4_init() only involves reading from and writing to CR4,
>     and setting up exception handling does not depend on any specific
>     CR4 bits being set (Arguably CR4.PAE, CR4.PSE and CR4.PGE are
>     related but they are already set before start_secondary() anyway).
>     
>     Notably, this triple fault can still occur before FRED is enabled,
>     while the bringup IDT is in use, since it lacks a #VE handler.
>     
>     BTW, on 32-bit systems, loading CR3 with swapper_pg_dir is moved
>     ahead of cr4_init(), which appears to be harmless.
>     
>     Signed-off-by: Xin Li (Intel) <xin@zytor.com>
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index eb289abece23..24497258c16b 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -231,13 +231,6 @@ static void ap_calibrate_delay(void)
>   */
>  static void notrace __noendbr start_secondary(void *unused)
>  {
> -	/*
> -	 * Don't put *anything* except direct CPU state initialization
> -	 * before cpu_init(), SMP booting is too fragile that we want to
> -	 * limit the things done here to the most necessary things.
> -	 */
> -	cr4_init();
> -
>  	/*
>  	 * 32-bit specific. 64-bit reaches this code with the correct page
>  	 * table established. Yet another historical divergence.
> @@ -248,8 +241,37 @@ static void notrace __noendbr start_secondary(void *unused)
>  		__flush_tlb_all();
>  	}
>  
> +	/*
> +	 * AP startup assembly code has setup the following before calling
> +	 * start_secondary() on 64-bit:
> +	 *
> +	 * 1) CS set to __KERNEL_CS.
> +	 * 2) CR3 switched to the init_top_pgt.
> +	 * 3) CR4.PAE, CR4.PSE and CR4.PGE are set.
> +	 * 4) GDT set to per-CPU gdt_page.
> +	 * 5) ALL data segments set to the NULL descriptor.
> +	 * 6) MSR_GS_BASE set to per-CPU offset.
> +	 * 7) IDT set to bringup IDT.
> +	 * 8) CR0 set to CR0_STATE.
> +	 *
> +	 * So it's ready to setup exception handling.
> +	 */
>  	cpu_init_exception_handling(false);
>  
> +	/*
> +	 * Ensure bits set in cr4_pinned_bits are set in CR4.
> +	 *
> +	 * cr4_pinned_bits is a subset of cr4_pinned_mask, which includes
> +	 * the following bits:
> +	 *         X86_CR4_SMEP
> +	 *         X86_CR4_SMAP
> +	 *         X86_CR4_UMIP
> +	 *         X86_CR4_FSGSBASE
> +	 *         X86_CR4_CET
> +	 *         X86_CR4_FRED
> +	 */
> +	cr4_init();
> +
>  	/*
>  	 * Load the microcode before reaching the AP alive synchronization
>  	 * point below so it is not part of the full per CPU serialized
> @@ -275,6 +297,11 @@ static void notrace __noendbr start_secondary(void *unused)
>  	 */
>  	cpuhp_ap_sync_alive();
>  
> +	/*
> +	 * Don't put *anything* except direct CPU state initialization
> +	 * before cpu_init(), SMP booting is too fragile that we want to
> +	 * limit the things done here to the most necessary things.
> +	 */
>  	cpu_init();
>  	fpu__init_cpu();
>  	rcutree_report_cpu_starting(raw_smp_processor_id());
> 

Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Xin Li 2 days ago

> On Feb 5, 2026, at 12:54 AM, Nikunj A. Dadhania <nikunj@amd.com> wrote:
> 
>> 
>> Can you please check if it works for you (#VC handler is set in the bringup IDT on AMD)?
> 
> Yes, this works as well. With your change that moves cr4_init(), I no longer
> need my arch/x86/kernel/fred.c modification (moving pr_info() to avoid the #VC).
> SEV-ES / SEV-SNP guests boot successfully with FRED enabled.
> 
> Are you planning to post this for inclusion?

Yes, I will.
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Greg KH 2 days, 9 hours ago
On Thu, Feb 05, 2026 at 05:10:30AM +0000, Nikunj A Dadhania wrote:
> @@ -70,6 +67,17 @@ void cpu_init_fred_exceptions(void)
>  	/* Use int $0x80 for 32-bit system calls in FRED mode */
>  	setup_clear_cpu_cap(X86_FEATURE_SYSFAST32);
>  	setup_clear_cpu_cap(X86_FEATURE_SYSCALL32);
> +
> +	/*
> +	 * For secondary processors, FRED bit in CR4 gets enabled in cr4_init()
> +	 * and FRED MSRs are not configured till the end of this function. For
> +	 * SEV-ES and SNP guests, any console write before the FRED MSRs are
> +	 * setup will cause a #VC and cannot be handled. Move the pr_info to
> +	 * the end of this function.
> +	 *
> +	 * When FRED is enabled by default, remove this log message
> +	 */
> +	pr_info("Initialized FRED on CPU%d\n", smp_processor_id());

Did you forget to fix this up?

Also, when the kernel is working properly, it is quiet, so why is this
log message needed?

thanks,

greg k-h
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Nikunj A. Dadhania 2 days, 8 hours ago

On 2/5/2026 11:26 AM, Greg KH wrote:
> On Thu, Feb 05, 2026 at 05:10:30AM +0000, Nikunj A Dadhania wrote:
>> @@ -70,6 +67,17 @@ void cpu_init_fred_exceptions(void)
>>  	/* Use int $0x80 for 32-bit system calls in FRED mode */
>>  	setup_clear_cpu_cap(X86_FEATURE_SYSFAST32);
>>  	setup_clear_cpu_cap(X86_FEATURE_SYSCALL32);
>> +
>> +	/*
>> +	 * For secondary processors, FRED bit in CR4 gets enabled in cr4_init()
>> +	 * and FRED MSRs are not configured till the end of this function. For
>> +	 * SEV-ES and SNP guests, any console write before the FRED MSRs are
>> +	 * setup will cause a #VC and cannot be handled. Move the pr_info to
>> +	 * the end of this function.
>> +	 *
>> +	 * When FRED is enabled by default, remove this log message
>> +	 */
>> +	pr_info("Initialized FRED on CPU%d\n", smp_processor_id());
> 
> Did you forget to fix this up?

I didn't forget. I moved the message to the end of cpu_init_fred_exceptions()
because, in its original position, it triggered a #VC on SEV-ES/SNP guests before
the FRED MSRs were configured, causing boot failures.

> Also, when the kernel is working properly, it is quiet, so why is this
> log message needed?
> 
> thanks,
> 
> greg k-h
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Greg KH 2 days, 9 hours ago
On Thu, Feb 05, 2026 at 05:10:30AM +0000, Nikunj A Dadhania wrote:
> FRED enabled SEV-ES and SNP guests fail to boot due to the following
> issues in the early boot sequence:
> 
> * FRED does not have a #VC exception handler in the dispatch logic
> 
> * For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
>   console output triggers a #VC which cannot be handled
> 
> * Early FRED #VC exceptions should use boot_ghcb until per-CPU GHCBs are
>   initialized
> 
> Fix these issues to ensure SEV-ES/SNP guests can handle #VC exceptions
> correctly during early boot when FRED is enabled.
> 
> Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
> Cc: stable@vger.kernel.org # 6.9+
> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> ---
> 
> Reason to add stable tag:
> 
> With FRED support for SVM here 
> https://lore.kernel.org/kvm/20260129063653.3553076-1-shivansh.dhiman@amd.com,
> SVM and SEV guests running 6.9 and later kernels will support FRED.
> However, *SEV-ES and SNP guests cannot support FRED* and will fail to boot
> with the following error:
> 
>     [    0.005144] Using GB pages for direct mapping
>     [    0.008402] Initialize FRED on CPU0
>     qemu-system-x86_64: cpus are not resettable, terminating
> 
> Three problems were identified as detailed in the commit message above and
> is fixed with this patch.
> 
> I would like the patch to be backported to the LTS kernels (6.12 and 6.18) to
> ensure SEV-ES and SNP guests running these stable kernel versions can boot
> with FRED enabled on FRED-enabled hypervisors.

That sounds like new hardware support, if you really want that, why not
just use newer kernel versions with this fix in it?  Obviously no one is
running those kernels on that hardware today, so this isn't a regression :)

thanks,

greg k-h
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Nikunj A. Dadhania 2 days, 8 hours ago

On 2/5/2026 11:25 AM, Greg KH wrote:
> On Thu, Feb 05, 2026 at 05:10:30AM +0000, Nikunj A Dadhania wrote:
>> FRED enabled SEV-ES and SNP guests fail to boot due to the following
>> issues in the early boot sequence:
>>
>> * FRED does not have a #VC exception handler in the dispatch logic
>>
>> * For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
>>   console output triggers a #VC which cannot be handled
>>
>> * Early FRED #VC exceptions should use boot_ghcb until per-CPU GHCBs are
>>   initialized
>>
>> Fix these issues to ensure SEV-ES/SNP guests can handle #VC exceptions
>> correctly during early boot when FRED is enabled.
>>
>> Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
>> Cc: stable@vger.kernel.org # 6.9+
>> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
>> ---
>>
>> Reason to add stable tag:
>>
>> With FRED support for SVM here 
>> https://lore.kernel.org/kvm/20260129063653.3553076-1-shivansh.dhiman@amd.com,
>> SVM and SEV guests running 6.9 and later kernels will support FRED.
>> However, *SEV-ES and SNP guests cannot support FRED* and will fail to boot
>> with the following error:
>>
>>     [    0.005144] Using GB pages for direct mapping
>>     [    0.008402] Initialize FRED on CPU0
>>     qemu-system-x86_64: cpus are not resettable, terminating
>>
>> Three problems were identified as detailed in the commit message above and
>> is fixed with this patch.
>>
>> I would like the patch to be backported to the LTS kernels (6.12 and 6.18) to
>> ensure SEV-ES and SNP guests running these stable kernel versions can boot
>> with FRED enabled on FRED-enabled hypervisors.
> 
> That sounds like new hardware support, if you really want that, why not
> just use newer kernel versions with this fix in it?  Obviously no one is
> running those kernels on that hardware today, so this isn't a regression :)

Fair point.

However, the situation is a bit nuanced: FRED hardware is available now, and
users running current stable kernels as guests will encounter boot
failures when the hypervisor is updated to support FRED. While not a traditional
regression, it creates a compatibility gap where stable guest kernels cannot run
on updated hypervisors.

The other option would be to disable FRED for SEV-ES and SNP guests in stable kernels.
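
[Editor's sketch, not a proposal made in this thread.] The "disable FRED"
option can be pictured with a small userspace model: a guest whose register
state is encrypted (SEV-ES or SNP) drops the FRED feature bit early in
boot, while plain SVM/SEV guests keep it. All names below are hypothetical.
In the kernel this would presumably map to a cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)
check followed by setup_clear_cpu_cap(X86_FEATURE_FRED), but that mapping
is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits for the model; not the kernel's cpufeature numbering. */
#define MODEL_FEATURE_FRED	(1u << 0)
#define MODEL_FEATURE_OTHER	(1u << 1)

/* Hypothetical model of a guest's boot-time view of itself. */
struct model_guest {
	uint32_t features;		/* enumerated CPU features */
	bool guest_state_encrypted;	/* true for SEV-ES/SNP, false for plain SVM/SEV */
};

/* Clear FRED only when guest register state is encrypted; leave other bits alone. */
static void model_disable_fred_if_encrypted_state(struct model_guest *g)
{
	assert(g);
	if (g->guest_state_encrypted)
		g->features &= ~MODEL_FEATURE_FRED;
}

static bool model_has_fred(const struct model_guest *g)
{
	return (g->features & MODEL_FEATURE_FRED) != 0;
}
```

The point of the sketch is that the decision is made inside the guest,
which knows both its kernel version and whether its state is encrypted.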

Regards
Nikunj
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Greg KH 2 days, 8 hours ago
On Thu, Feb 05, 2026 at 11:40:11AM +0530, Nikunj A. Dadhania wrote:
> 
> 
> On 2/5/2026 11:25 AM, Greg KH wrote:
> > On Thu, Feb 05, 2026 at 05:10:30AM +0000, Nikunj A Dadhania wrote:
> >> FRED enabled SEV-ES and SNP guests fail to boot due to the following
> >> issues in the early boot sequence:
> >>
> >> * FRED does not have a #VC exception handler in the dispatch logic
> >>
> >> * For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
> >>   console output triggers a #VC which cannot be handled
> >>
> >> * Early FRED #VC exceptions should use boot_ghcb until per-CPU GHCBs are
> >>   initialized
> >>
> >> Fix these issues to ensure SEV-ES/SNP guests can handle #VC exceptions
> >> correctly during early boot when FRED is enabled.
> >>
> >> Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
> >> Cc: stable@vger.kernel.org # 6.9+
> >> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> >> ---
> >>
> >> Reason to add stable tag:
> >>
> >> With FRED support for SVM here 
> >> https://lore.kernel.org/kvm/20260129063653.3553076-1-shivansh.dhiman@amd.com,
> >> SVM and SEV guests running 6.9 and later kernels will support FRED.
> >> However, *SEV-ES and SNP guests cannot support FRED* and will fail to boot
> >> with the following error:
> >>
> >>     [    0.005144] Using GB pages for direct mapping
> >>     [    0.008402] Initialize FRED on CPU0
> >>     qemu-system-x86_64: cpus are not resettable, terminating
> >>
> >> Three problems were identified as detailed in the commit message above and
> >> is fixed with this patch.
> >>
> >> I would like the patch to be backported to the LTS kernels (6.12 and 6.18) to
> >> ensure SEV-ES and SNP guests running these stable kernel versions can boot
> >> with FRED enabled on FRED-enabled hypervisors.
> > 
> > That sounds like new hardware support, if you really want that, why not
> > just use newer kernel versions with this fix in it?  Obviously no one is
> > running those kernels on that hardware today, so this isn't a regression :)
> 
> Fair point.
> 
> However, the situation is a bit nuanced: FRED hardware is available now, and
> users running current stable kernels as guests will encounter boot
> failures when the hypervisor is updated to support FRED. While not a traditional
> regression, it creates a compatibility gap where stable guest kernels cannot run
> on updated hypervisors.

Great, then upgrade those guest kernels as they have never been able to
run on those hypervisors :)

> The other option would be to disable FRED for SEV-ES and SNP guests in stable kernels.

That's a choice for the hypervisor vendors to make.

thanks,

greg k-h
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Sean Christopherson 1 day, 23 hours ago
On Thu, Feb 05, 2026, Greg KH wrote:
> On Thu, Feb 05, 2026 at 11:40:11AM +0530, Nikunj A. Dadhania wrote:
> > 
> > 
> > On 2/5/2026 11:25 AM, Greg KH wrote:
> > > On Thu, Feb 05, 2026 at 05:10:30AM +0000, Nikunj A Dadhania wrote:
> > >> FRED enabled SEV-ES and SNP guests fail to boot due to the following
> > >> issues in the early boot sequence:
> > >>
> > >> * FRED does not have a #VC exception handler in the dispatch logic
> > >>
> > >> * For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
> > >>   console output triggers a #VC which cannot be handled
> > >>
> > >> * Early FRED #VC exceptions should use boot_ghcb until per-CPU GHCBs are
> > >>   initialized
> > >>
> > >> Fix these issues to ensure SEV-ES/SNP guests can handle #VC exceptions
> > >> correctly during early boot when FRED is enabled.
> > >>
> > >> Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
> > >> Cc: stable@vger.kernel.org # 6.9+
> > >> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> > >> ---
> > >>
> > >> Reason to add stable tag:
> > >>
> > >> With FRED support for SVM here 
> > >> https://lore.kernel.org/kvm/20260129063653.3553076-1-shivansh.dhiman@amd.com,
> > >> SVM and SEV guests running 6.9 and later kernels will support FRED.
> > >> However, *SEV-ES and SNP guests cannot support FRED* and will fail to boot
> > >> with the following error:
> > >>
> > >>     [    0.005144] Using GB pages for direct mapping
> > >>     [    0.008402] Initialize FRED on CPU0
> > >>     qemu-system-x86_64: cpus are not resettable, terminating
> > >>
> > >> Three problems were identified as detailed in the commit message above and
> > >> is fixed with this patch.
> > >>
> > >> I would like the patch to be backported to the LTS kernels (6.12 and 6.18) to
> > >> ensure SEV-ES and SNP guests running these stable kernel versions can boot
> > >> with FRED enabled on FRED-enabled hypervisors.
> > > 
> > > That sounds like new hardware support, if you really want that, why not
> > > just use newer kernel versions with this fix in it?  Obviously no one is
> > > running those kernels on that hardware today, so this isn't a regression :)

I disagree, this absolutely is a regression.  Kernels without commit 14619d912b65
will boot on this "new" hardware, kernels with the commit will not.

> > Fair point.
> > 
> > However, the situation is a bit nuanced: FRED hardware is available now, and
> > users running current stable kernels as guests will encounter boot
> > failures when the hypervisor is updated to support FRED. While not a traditional
> > regression, it creates a compatibility gap where stable guest kernels cannot run
> > on updated hypervisors.
> 
> Great, then upgrade those guest kernels as they have never been able to
> run on those hypervisors :)

As above, *upgrading* from e.g. 6.6 to 6.12 will suddenly fail to boot.

> > The other option would be to disable FRED for SEV-ES and SNP guests in stable kernels.
> 
> That's a choice for the hypervisor vendors to make.

No, because the hypervisor has no clue what kernel version the guest is running.
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Greg KH 1 day, 23 hours ago
On Thu, Feb 05, 2026 at 07:50:09AM -0800, Sean Christopherson wrote:
> On Thu, Feb 05, 2026, Greg KH wrote:
> > On Thu, Feb 05, 2026 at 11:40:11AM +0530, Nikunj A. Dadhania wrote:
> > > 
> > > 
> > > On 2/5/2026 11:25 AM, Greg KH wrote:
> > > > On Thu, Feb 05, 2026 at 05:10:30AM +0000, Nikunj A Dadhania wrote:
> > > >> FRED enabled SEV-ES and SNP guests fail to boot due to the following
> > > >> issues in the early boot sequence:
> > > >>
> > > >> * FRED does not have a #VC exception handler in the dispatch logic
> > > >>
> > > >> * For secondary CPUs, FRED is enabled before setting up the FRED MSRs, and
> > > >>   console output triggers a #VC which cannot be handled
> > > >>
> > > >> * Early FRED #VC exceptions should use boot_ghcb until per-CPU GHCBs are
> > > >>   initialized
> > > >>
> > > >> Fix these issues to ensure SEV-ES/SNP guests can handle #VC exceptions
> > > >> correctly during early boot when FRED is enabled.
> > > >>
> > > >> Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
> > > >> Cc: stable@vger.kernel.org # 6.9+
> > > >> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
> > > >> ---
> > > >>
> > > >> Reason to add stable tag:
> > > >>
> > > >> With FRED support for SVM here 
> > > >> https://lore.kernel.org/kvm/20260129063653.3553076-1-shivansh.dhiman@amd.com,
> > > >> SVM and SEV guests running 6.9 and later kernels will support FRED.
> > > >> However, *SEV-ES and SNP guests cannot support FRED* and will fail to boot
> > > >> with the following error:
> > > >>
> > > >>     [    0.005144] Using GB pages for direct mapping
> > > >>     [    0.008402] Initialize FRED on CPU0
> > > >>     qemu-system-x86_64: cpus are not resettable, terminating
> > > >>
> > > >> Three problems were identified as detailed in the commit message above and
> > > >> is fixed with this patch.
> > > >>
> > > >> I would like the patch to be backported to the LTS kernels (6.12 and 6.18) to
> > > >> ensure SEV-ES and SNP guests running these stable kernel versions can boot
> > > >> with FRED enabled on FRED-enabled hypervisors.
> > > > 
> > > > That sounds like new hardware support, if you really want that, why not
> > > > just use newer kernel versions with this fix in it?  Obviously no one is
> > > > running those kernels on that hardware today, so this isn't a regression :)
> 
> I disagree, this absolutely is a regression.  Kernels without commit 14619d912b65
> will boot on this "new" hardware, kernels with the commit will not.

That commit added the new FRED feature, which "broke" when it hit real
hardware.  Not really a "regression" in my opinion as obviously it never
worked at all :)

Anyway, I'll let you x86 maintainers here hash that out, just my
thoughts...

thanks,

greg k-h
Re: [PATCH] x86/fred: Fix early boot failures on SEV-ES/SNP guests
Posted by Dave Hansen 1 day, 23 hours ago
On 2/5/26 07:50, Sean Christopherson wrote:
>>>> That sounds like new hardware support, if you really want that, why not
>>>> just use newer kernel versions with this fix in it?  Obviously no one is
>>>> running those kernels on that hardware today, so this isn't a regression :)
> I disagree, this absolutely is a regression.  Kernels without commit 14619d912b65
> will boot on this "new" hardware, kernels with the commit will not.

Yeah, it is a regression for sure. It's a weird one, but it is a regression.

We need to either disable FRED on SEV-ES/SNP+FRED systems or fix it. We
obviously want to fix it in mainline. I guess stable could do something
different here and disable FRED instead if it wanted. That would avoid
even a whiff of appearing to add new hardware support.

I'd personally prefer to just keep stable and mainline as close as possible.

P.S. #VC and #VE are a scourge