[PATCH] x86/efi: Remove NX check from efi-boot.h

Julian Vetter posted 1 patch 2 weeks, 2 days ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20251127143136.1598354-1-julian.vetter@vates.tech
xen/arch/x86/efi/efi-boot.h | 12 ------------
1 file changed, 12 deletions(-)
[PATCH] x86/efi: Remove NX check from efi-boot.h
Posted by Julian Vetter 2 weeks, 2 days ago
Currently Intel CPUs in EFI mode with the "Execute Disable Bit" disabled
and the 'CONFIG_REQUIRE_NX=y' fail to boot, because this check is
performed before trampoline_setup is called, which determines if NX is
supported or if it's hidden by 'MSR_IA32_MISC_ENABLE[34] = 1' (if so,
re-enables NX).

Signed-off-by: Julian Vetter <julian.vetter@vates.tech>
---
 xen/arch/x86/efi/efi-boot.h | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
index 0194720003..8dfd549f12 100644
--- a/xen/arch/x86/efi/efi-boot.h
+++ b/xen/arch/x86/efi/efi-boot.h
@@ -748,18 +748,6 @@ static void __init efi_arch_cpu(void)
     if ( (eax >> 16) == 0x8000 && eax > 0x80000000U )
     {
         caps[FEATURESET_e1d] = cpuid_edx(0x80000001U);
-
-        /*
-         * This check purposefully doesn't use cpu_has_nx because
-         * cpu_has_nx bypasses the boot_cpu_data read if Xen was compiled
-         * with CONFIG_REQUIRE_NX
-         */
-        if ( IS_ENABLED(CONFIG_REQUIRE_NX) &&
-             !boot_cpu_has(X86_FEATURE_NX) )
-            blexit(L"This build of Xen requires NX support");
-
-        if ( cpu_has_nx )
-            trampoline_efer |= EFER_NXE;
     }
 }
 
-- 
2.51.0



--
Julian Vetter | Vates Hypervisor & Kernel Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [PATCH] x86/efi: Remove NX check from efi-boot.h
Posted by Teddy Astie 2 weeks, 2 days ago
Le 27/11/2025 à 15:33, Julian Vetter a écrit :
> Currently Intel CPUs in EFI mode with the "Execute Disable Bit" disabled
> and the 'CONFIG_REQUIRE_NX=y' fail to boot, because this check is
> performed before trampoline_setup is called, which determines if NX is
> supported or if it's hidden by 'MSR_IA32_MISC_ENABLE[34] = 1' (if so,
> re-enables NX).
> 
> Signed-off-by: Julian Vetter <julian.vetter@vates.tech>
> ---
>   xen/arch/x86/efi/efi-boot.h | 12 ------------
>   1 file changed, 12 deletions(-)
> 
> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
> index 0194720003..8dfd549f12 100644
> --- a/xen/arch/x86/efi/efi-boot.h
> +++ b/xen/arch/x86/efi/efi-boot.h
> @@ -748,18 +748,6 @@ static void __init efi_arch_cpu(void)
>       if ( (eax >> 16) == 0x8000 && eax > 0x80000000U )
>       {
>           caps[FEATURESET_e1d] = cpuid_edx(0x80000001U);
> -
> -        /*
> -         * This check purposefully doesn't use cpu_has_nx because
> -         * cpu_has_nx bypasses the boot_cpu_data read if Xen was compiled
> -         * with CONFIG_REQUIRE_NX
> -         */
> -        if ( IS_ENABLED(CONFIG_REQUIRE_NX) &&
> -             !boot_cpu_has(X86_FEATURE_NX) )
> -            blexit(L"This build of Xen requires NX support");
> -
> -        if ( cpu_has_nx )
> -            trampoline_efer |= EFER_NXE;

I don't think we want to skip setting EFER_NXE. As it would mean not 
using NX at all (unless I missed something).

If cpu_policy doesn't have nx, it is likely going to cause issues e.g in 
VMs which will not see NX and potentially refuse to boot. I don't really 
know in which order things are initialized, but it probably wants to be 
considered.

Perhaps, we want to do something like detecting the 
MSR_IA32_MISC_ENABLE[34] then adjusting the cpu_policy appropriately 
after patching it ?

>       }
>   }
>   



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [PATCH] x86/efi: Remove NX check from efi-boot.h
Posted by Julian Vetter 2 weeks, 1 day ago

On 11/27/25 16:33, Teddy Astie wrote:
> Le 27/11/2025 à 15:33, Julian Vetter a écrit :
>> Currently Intel CPUs in EFI mode with the "Execute Disable Bit" disabled
>> and the 'CONFIG_REQUIRE_NX=y' fail to boot, because this check is
>> performed before trampoline_setup is called, which determines if NX is
>> supported or if it's hidden by 'MSR_IA32_MISC_ENABLE[34] = 1' (if so,
>> re-enables NX).
>>
>> Signed-off-by: Julian Vetter <julian.vetter@vates.tech>
>> ---
>>    xen/arch/x86/efi/efi-boot.h | 12 ------------
>>    1 file changed, 12 deletions(-)
>>
>> diff --git a/xen/arch/x86/efi/efi-boot.h b/xen/arch/x86/efi/efi-boot.h
>> index 0194720003..8dfd549f12 100644
>> --- a/xen/arch/x86/efi/efi-boot.h
>> +++ b/xen/arch/x86/efi/efi-boot.h
>> @@ -748,18 +748,6 @@ static void __init efi_arch_cpu(void)
>>        if ( (eax >> 16) == 0x8000 && eax > 0x80000000U )
>>        {
>>            caps[FEATURESET_e1d] = cpuid_edx(0x80000001U);
>> -
>> -        /*
>> -         * This check purposefully doesn't use cpu_has_nx because
>> -         * cpu_has_nx bypasses the boot_cpu_data read if Xen was compiled
>> -         * with CONFIG_REQUIRE_NX
>> -         */
>> -        if ( IS_ENABLED(CONFIG_REQUIRE_NX) &&
>> -             !boot_cpu_has(X86_FEATURE_NX) )
>> -            blexit(L"This build of Xen requires NX support");
>> -
>> -        if ( cpu_has_nx )
>> -            trampoline_efer |= EFER_NXE;
> 
> I don't think we want to skip setting EFER_NXE. As it would mean not
> using NX at all (unless I missed something).
> 

Yes, I though the code in trampoline_setup is taken in any case. Because 
at the label .Lgot_nx the EFER_NXE is set. But Andrew said that this is 
not always the case, then you're right this should be kept.

> If cpu_policy doesn't have nx, it is likely going to cause issues e.g in
> VMs which will not see NX and potentially refuse to boot. I don't really
> know in which order things are initialized, but it probably wants to be
> considered.
> 
> Perhaps, we want to do something like detecting the
> MSR_IA32_MISC_ENABLE[34] then adjusting the cpu_policy appropriately
> after patching it ?
> 

yes, I was wondering if we couldn't do the check for 
MSR_IA32_MISC_ENABLE[34] == 1 directly in the efi_arch_cpu().

>>        }
>>    }
>>    
> 
> 
> 
> --
> Teddy Astie | Vates XCP-ng Developer
> 
> XCP-ng & Xen Orchestra - Vates solutions
> 
> web: https://vates.tech



--
Julian Vetter | Vates Hypervisor & Kernel Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [PATCH] x86/efi: Remove NX check from efi-boot.h
Posted by Andrew Cooper 2 weeks, 2 days ago
On 27/11/2025 2:31 pm, Julian Vetter wrote:
> Currently Intel CPUs in EFI mode with the "Execute Disable Bit" disabled
> and the 'CONFIG_REQUIRE_NX=y' fail to boot, because this check is
> performed before trampoline_setup is called, which determines if NX is
> supported or if it's hidden by 'MSR_IA32_MISC_ENABLE[34] = 1' (if so,
> re-enables NX).
>
> Signed-off-by: Julian Vetter <julian.vetter@vates.tech>

Lovely...  This isn't the only bug; there's another one from the Vates
forums about AMD CPUs which I haven't gotten around to fixing yet.

Do you have any more information about which system looks like this?

trampoline_setup isn't executed on all EFI boots.  I had a different fix
in mind, but it's a little more complicated.

If I do the key prep patch, would you mind trying to tackle the AMD side
too?

~Andrew

Re: [PATCH] x86/efi: Remove NX check from efi-boot.h
Posted by Julian Vetter 2 weeks, 1 day ago
On 11/27/25 16:20, Andrew Cooper wrote:
> On 27/11/2025 2:31 pm, Julian Vetter wrote:
>> Currently Intel CPUs in EFI mode with the "Execute Disable Bit" disabled
>> and the 'CONFIG_REQUIRE_NX=y' fail to boot, because this check is
>> performed before trampoline_setup is called, which determines if NX is
>> supported or if it's hidden by 'MSR_IA32_MISC_ENABLE[34] = 1' (if so,
>> re-enables NX).
>>
>> Signed-off-by: Julian Vetter <julian.vetter@vates.tech>
> 
> Lovely...  This isn't the only bug; there's another one from the Vates
> forums about AMD CPUs which I haven't gotten around to fixing yet.
> 

Thank you. I will have a look. I haven't seen this thread.

> Do you have any more information about which system looks like this?
> 
I'm not sure if I understand your question correctly, but I was just 
booting an Intel based machine newer than ~2012. I have tested this on 4 
different machines, on which 3 hit this code path. One was a HPE 
ProLiant m510 Server with a XEON CPU, second was a Mini PC with Celeron 
CPU, and third was an old Intel NUC DCCP847DYE also with a Celeron CPU. 
The only system where I couldn't reproduce the issue was an old 
workstation with a Gigabyte mainboard. It has the flag in the Bios to 
set MSR_IA32_MISC_ENABLE, but I'm not sure if it was actually booting 
via UEFI. I will verify this on monday. I booted all the 3 other systems 
via UEFI -> Grub -> multiboot2. My grub entry looks like this:

multiboot2 /boot/xen.gz dom0_mem=2656M,max:2656M watchdog ucode=scan 
dom0_max_vcpus=1-8 crashkernel=256M,below=4G console=vga vga=mode-0x0311
module2 boot/vmlinuz console=hvc0 console=tty0 init=/bin/sh
module2 boot/initrd-dom0

> trampoline_setup isn't executed on all EFI boots.  I had a different fix
> in mind, but it's a little more complicated.

Aha. yes, I didn't thought about other code paths. But I'm wondering if 
we couldn't do the whole dance with XD and NX directly in the 
efi-boot.h. Then maybe we could even remove the part in trampoline_setup 
or are there other systems that hit only the trampoline_setup but not 
the efi_multiboot2?

@@ -747,16 +748,27 @@ static void __init efi_arch_cpu(void)

      if ( (eax >> 16) == 0x8000 && eax > 0x80000000U )
      {
-        caps[FEATURESET_e1d] = cpuid_edx(0x80000001U);
-
          /*
           * This check purposefully doesn't use cpu_has_nx because
           * cpu_has_nx bypasses the boot_cpu_data read if Xen was compiled
           * with CONFIG_REQUIRE_NX
           */
-        if ( IS_ENABLED(CONFIG_REQUIRE_NX) &&
-             !boot_cpu_has(X86_FEATURE_NX) )
-            blexit(L"This build of Xen requires NX support");
+        if (IS_ENABLED(CONFIG_REQUIRE_NX)) {
+
+            msr_ia32 = rdmsr(MSR_IA32_MISC_ENABLE);
+            /* NX is hidden */
+            if (msr_ia32 & MSR_IA32_MISC_ENABLE_XD_DISABLE) {
+                msr_ia32 &= ~MSR_IA32_MISC_ENABLE_XD_DISABLE;
+
+                wrmsr(MSR_IA32_MISC_ENABLE_XD_DISABLE, msr_ia32);
+
+                /* Re-check for NX */
+                caps[FEATURESET_e1d] = cpuid_edx(0x80000001U);
+            }
+
+            if (!boot_cpu_has(X86_FEATURE_NX))
+                blexit(L"This build of Xen requires NX support");
+        }

          if ( cpu_has_nx )
              trampoline_efer |= EFER_NXE;


> 
> If I do the key prep patch, would you mind trying to tackle the AMD side
> too?
Yes of course. I will have a look into it. Thank you.

> 
> ~Andrew



--
Julian Vetter | Vates Hypervisor & Kernel Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [PATCH] x86/efi: Remove NX check from efi-boot.h
Posted by Andrew Cooper 2 weeks, 1 day ago
On 28/11/2025 1:19 pm, Julian Vetter wrote:
> On 11/27/25 16:20, Andrew Cooper wrote:
>> On 27/11/2025 2:31 pm, Julian Vetter wrote:
>>> Currently Intel CPUs in EFI mode with the "Execute Disable Bit" disabled
>>> and the 'CONFIG_REQUIRE_NX=y' fail to boot, because this check is
>>> performed before trampoline_setup is called, which determines if NX is
>>> supported or if it's hidden by 'MSR_IA32_MISC_ENABLE[34] = 1' (if so,
>>> re-enables NX).
>>>
>>> Signed-off-by: Julian Vetter <julian.vetter@vates.tech>
>> Lovely...  This isn't the only bug; there's another one from the Vates
>> forums about AMD CPUs which I haven't gotten around to fixing yet.
>>
> Thank you. I will have a look. I haven't seen this thread.

https://xcp-ng.org/forum/post/80714

But the tl;dr is that AMD have introduced a firmware option to disable
NX.  Unlike Intel, there's no positive way to know you've reactivated it.

A conversation with AMD has revealed that there's no capability to
prevent setting EFER.NXE, and that NX is always available in practice. 
I'm pretty sure the firmware is just clearing NX in the CPUID Override MSR.

However, to reactivate this safely, we need to do a wrmsr_safe(), which
means we need to delay setting NXE until exception handling is available
which is rather later on boot.  There's also a tangle with the
order-of-initialisation of the CPUID Override MSRs which I found
recently while doing something else.

The other observation is that, even on a STRICT_NX build of Xen, we can
boot into __start_xen() because we can't insert NX into the pagetables
that early.  In fact it's quite late that we lock down permissions; see
the calls to modify_xen_mappings() in __start_xen().

Given that we need to be this late for AMD, we can also move the Intel
logic later (effectively reverts part of the original work; sorry
Alejandro) which means we can also use safe accessors, and we don't need
to worry about the divergent early paths.

>
>> Do you have any more information about which system looks like this?
>>
> I'm not sure if I understand your question correctly, but I was just 
> booting an Intel based machine newer than ~2012. I have tested this on 4 
> different machines, on which 3 hit this code path. One was a HPE 
> ProLiant m510 Server with a XEON CPU

Broadwell.

> , second was a Mini PC with Celeron CPU,

Sorry, not enough information here to figure out the microarchitecture.

> and third was an old Intel NUC DCCP847DYE also with a Celeron CPU. 

Sandy Bridge.

> The only system where I couldn't reproduce the issue was an old 
> workstation with a Gigabyte mainboard. It has the flag in the Bios to 
> set MSR_IA32_MISC_ENABLE, but I'm not sure if it was actually booting 
> via UEFI.

Same, not enough information here.

But, it's clear that Intel's XD-disable is still honoured in EFI mode on
a wide range of systems, and that we need a fix for UEFI.

>  I will verify this on monday. I booted all the 3 other systems 
> via UEFI -> Grub -> multiboot2. My grub entry looks like this:
>
> multiboot2 /boot/xen.gz dom0_mem=2656M,max:2656M watchdog ucode=scan 
> dom0_max_vcpus=1-8 crashkernel=256M,below=4G console=vga vga=mode-0x0311
> module2 boot/vmlinuz console=hvc0 console=tty0 init=/bin/sh
> module2 boot/initrd-dom0
>
>> trampoline_setup isn't executed on all EFI boots.  I had a different fix
>> in mind, but it's a little more complicated.
> Aha. yes, I didn't thought about other code paths.

https://xenbits.xen.org/docs/latest/hypervisor-guide/x86/how-xen-boots.html

Here's something I put together to cover some of these details.  But,
most of the detail is in the source only.

~Andrew

Re: [PATCH] x86/efi: Remove NX check from efi-boot.h
Posted by Julian Vetter 1 week, 5 days ago
On 11/28/25 7:25 PM, Andrew Cooper wrote:
> On 28/11/2025 1:19 pm, Julian Vetter wrote:
>> On 11/27/25 16:20, Andrew Cooper wrote:
>>> On 27/11/2025 2:31 pm, Julian Vetter wrote:
>>>> Currently Intel CPUs in EFI mode with the "Execute Disable Bit" disabled
>>>> and the 'CONFIG_REQUIRE_NX=y' fail to boot, because this check is
>>>> performed before trampoline_setup is called, which determines if NX is
>>>> supported or if it's hidden by 'MSR_IA32_MISC_ENABLE[34] = 1' (if so,
>>>> re-enables NX).
>>>>
>>>> Signed-off-by: Julian Vetter <julian.vetter@vates.tech>
>>> Lovely...  This isn't the only bug; there's another one from the Vates
>>> forums about AMD CPUs which I haven't gotten around to fixing yet.
>>>
>> Thank you. I will have a look. I haven't seen this thread.
> 
> https://xcp-ng.org/forum/post/80714
> 
> But the tl;dr is that AMD have introduced a firmware option to disable
> NX.  Unlike Intel, there's no positive way to know you've reactivated it.
> 

Yes, I found the thread. Thank you. I have now tested on 3 different AMD 
machines as well:
* Laptop with an Ryzen 9 5900HX -> XN not exposed via BIOS
* Beelink with AMD Ryzen 7 6800H-> XN exposed via BIOS -> XEN fails to 
boot if disabled
* Workstation with AMD Ryzen 5 7600 -> XN exposed via BIOS -> XEN fails 
to boot if disabled

again, I booted all machines via efi -> grub -> multiboot2 and I end up 
in the same "This build of Xen requires NX support", as on the Intel 
machines. But, as you explained the code path is the same, and the check 
even later than on Intel machines. So, no surprise.

> A conversation with AMD has revealed that there's no capability to
> prevent setting EFER.NXE, and that NX is always available in practice.
> I'm pretty sure the firmware is just clearing NX in the CPUID Override MSR.
> 
> However, to reactivate this safely, we need to do a wrmsr_safe(), which
> means we need to delay setting NXE until exception handling is available
> which is rather later on boot.  There's also a tangle with the
> order-of-initialisation of the CPUID Override MSRs which I found
> recently while doing something else.
> 
> The other observation is that, even on a STRICT_NX build of Xen, we can
> boot into __start_xen() because we can't insert NX into the pagetables
> that early.  In fact it's quite late that we lock down permissions; see
> the calls to modify_xen_mappings() in __start_xen().
> 
> Given that we need to be this late for AMD, we can also move the Intel
> logic later (effectively reverts part of the original work; sorry
> Alejandro) which means we can also use safe accessors, and we don't need
> to worry about the divergent early paths.
> 
>>
>>> Do you have any more information about which system looks like this?
>>>
>> I'm not sure if I understand your question correctly, but I was just
>> booting an Intel based machine newer than ~2012. I have tested this on 4
>> different machines, on which 3 hit this code path. One was a HPE
>> ProLiant m510 Server with a XEON CPU
> 
> Broadwell.
> 
>> , second was a Mini PC with Celeron CPU,
> 
> Sorry, not enough information here to figure out the microarchitecture.

Alder Lake Celeron.

> 
>> and third was an old Intel NUC DCCP847DYE also with a Celeron CPU.
> 
> Sandy Bridge.
> 
>> The only system where I couldn't reproduce the issue was an old
>> workstation with a Gigabyte mainboard. It has the flag in the Bios to
>> set MSR_IA32_MISC_ENABLE, but I'm not sure if it was actually booting
>> via UEFI.
> 
> Same, not enough information here.

Ivy Bridge (Intel Core i5-3470)

> 
> But, it's clear that Intel's XD-disable is still honoured in EFI mode on
> a wide range of systems, and that we need a fix for UEFI.
> 
>>   I will verify this on monday. I booted all the 3 other systems
>> via UEFI -> Grub -> multiboot2. My grub entry looks like this:
>>
>> multiboot2 /boot/xen.gz dom0_mem=2656M,max:2656M watchdog ucode=scan
>> dom0_max_vcpus=1-8 crashkernel=256M,below=4G console=vga vga=mode-0x0311
>> module2 boot/vmlinuz console=hvc0 console=tty0 init=/bin/sh
>> module2 boot/initrd-dom0
>>
>>> trampoline_setup isn't executed on all EFI boots.  I had a different fix
>>> in mind, but it's a little more complicated.
>> Aha. yes, I didn't thought about other code paths.
> 
> https://xenbits.xen.org/docs/latest/hypervisor-guide/x86/how-xen-boots.html
> 
> Here's something I put together to cover some of these details.  But,
> most of the detail is in the source only.
> 
> ~Andrew



--
Julian Vetter | Vates Hypervisor & Kernel Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech