[RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)

Jon Kohler posted 18 patches 9 months, 1 week ago
Posted by Jon Kohler 9 months, 1 week ago
## Summary
This series introduces support for Intel Mode-Based Execute Control
(MBEC) to KVM and nested VMX virtualization, aiming to significantly
reduce VMexits and improve performance for Windows guests running with
Hypervisor-Protected Code Integrity (HVCI).

## What?
Intel MBEC is a hardware feature, introduced in the Kaby Lake
generation, that allows for more granular control over execution
permissions. MBEC enables the separation and tracking of execution
permissions for supervisor (kernel) and user-mode code. It is used as
an accelerator for Microsoft's Memory Integrity [1] (also known as
hypervisor-protected code integrity or HVCI).

## Why?
The primary reason for this feature is performance.

Without hardware-level MBEC, enabling Windows HVCI runs a 'software
MBEC' known as Restricted User Mode, which imposes a runtime overhead
due to increased state transitions between the guest's L2 root
partition and the L2 secure partition for running kernel mode code
integrity operations.

In practice, this results in a significant number of exits. For
example, playing a YouTube video within the Edge Browser produces
roughly 1.2 million VMexits/second across an 8 vCPU Windows 11 guest.

Most of these exits are VMREAD/VMWRITE operations, which can be
avoided with Enlightened VMCS (eVMCS) [2]. However, even with eVMCS, this
configuration still produces around 200,000 VMexits/second.

With MBEC exposed to the L1 Windows Hypervisor, the same scenario
results in approximately 50,000 VMexits/second, a *24x* reduction from
the baseline.

Not a typo, 24x reduction in VMexits.
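
As a quick check of the arithmetic, the sketch below recomputes the ratios from the three exit rates quoted above. The constant and function names are illustrative only, and it assumes the 50,000 figure was measured with eVMCS still enabled, which the text implies but does not state outright.

```python
# Exit rates quoted in the cover letter (VMexits/second, 8 vCPU Windows 11 guest).
BASELINE = 1_200_000    # HVCI with software MBEC, no eVMCS
WITH_EVMCS = 200_000    # eVMCS enabled, still software MBEC
WITH_MBEC = 50_000      # MBEC exposed to the L1 Windows hypervisor (assumed on top of eVMCS)


def reduction(before: int, after: int) -> float:
    """Return how many times smaller `after` is than `before`."""
    return before / after


print(reduction(BASELINE, WITH_EVMCS))   # contribution of eVMCS alone
print(reduction(WITH_EVMCS, WITH_MBEC))  # contribution of MBEC on top of eVMCS
print(reduction(BASELINE, WITH_MBEC))    # combined reduction from the baseline
```

Under that assumption, eVMCS accounts for a 6x reduction and MBEC for a further 4x, which multiply out to the 24x quoted from the baseline.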

## How?
This series implements core KVM support for exposing the MBEC bit in
secondary execution controls (bit 22) to L1 and L2, based on
configuration from user space and a module parameter
'enable_pt_guest_exec_control'. The inspiration for this series
started with Mickaël's series for Heki [3], where we've extracted,
refactored, and extended the MBEC-specific use case to be
general-purpose.

MBEC, which appears in Linux /proc/cpuinfo as ept_mode_based_exec,
splits the EPT exec bit (bit 2 in PTE) into two bits. When secondary
execution control bit 22 is set, PTE bit 2 reflects supervisor mode
executable, and PTE bit 10 reflects user mode executable.
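
The bit split described above can be modeled in a few lines. This is a minimal sketch in Python, not code from the series: the constant names and the `can_fetch` helper are hypothetical, and "user mode" is reduced to a boolean supplied by the caller.

```python
# Bit positions taken from the paragraph above; the names are hypothetical.
EPT_EXEC_BIT = 2        # execute (supervisor-mode execute when MBEC is enabled)
EPT_USER_EXEC_BIT = 10  # user-mode execute, only consulted when MBEC is enabled


def can_fetch(pte: int, user_mode: bool, mbec_enabled: bool) -> bool:
    """Return True if an instruction fetch through this EPT entry is permitted."""
    if mbec_enabled and user_mode:
        return bool(pte & (1 << EPT_USER_EXEC_BIT))
    # Without MBEC, bit 2 alone governs execution for both modes.
    return bool(pte & (1 << EPT_EXEC_BIT))


kernel_only = 1 << EPT_EXEC_BIT
print(can_fetch(kernel_only, user_mode=False, mbec_enabled=True))
print(can_fetch(kernel_only, user_mode=True, mbec_enabled=True))
print(can_fetch(kernel_only, user_mode=True, mbec_enabled=False))
```

With only bit 2 set, the entry is executable from supervisor mode in every case, but from user mode only while MBEC is disabled.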

The semantics for EPT violation qualifications also change when MBEC
is enabled, with bit 5 reflecting supervisor/kernel mode execute
permissions and bit 6 reflecting user mode execute permissions.
This ultimately serves to expose this feature to the L1 hypervisor,
which consumes MBEC and informs the L2 partitions not to use the
software MBEC by removing bit 14 in 0x40000004 EAX [4].
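
To make the exit qualification change concrete, here is a small decoding sketch. It is an illustration only: the names are hypothetical, and mirroring bit 5 into the user result when MBEC is disabled is an assumption made for convenience, not something the series specifies.

```python
# Exit qualification bit positions from the paragraph above; names are hypothetical.
QUAL_EXEC_BIT = 5       # execute permission (supervisor execute when MBEC is enabled)
QUAL_USER_EXEC_BIT = 6  # user-mode execute permission, meaningful only with MBEC


def exec_permissions(exit_qualification: int, mbec_enabled: bool) -> dict:
    """Decode the execute-permission bits reported for an EPT violation."""
    supervisor = bool((exit_qualification >> QUAL_EXEC_BIT) & 1)
    if mbec_enabled:
        user = bool((exit_qualification >> QUAL_USER_EXEC_BIT) & 1)
    else:
        # Assumption: with MBEC off there is one execute permission for both modes.
        user = supervisor
    return {"supervisor": supervisor, "user": user}


print(exec_permissions(0x20, mbec_enabled=True))
print(exec_permissions(0x40, mbec_enabled=True))
print(exec_permissions(0x20, mbec_enabled=False))
```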

## Where?
Enablement spans both VMX code and MMU code to teach the shadow MMU
about the different execution modes, as well as user space VMM to pass
secondary execution control bit 22. A patch for QEMU enablement is
available [5].

## Testing
Initial testing has been done on 6.12-based code with:
  Guests:
    - Windows 11 24H2 26100.2894
    - Windows Server 2025 24H2 26100.2894
    - Windows Server 2022 21H2 20348.825
  Processors:
    - Intel Skylake 6154
    - Intel Sapphire Rapids 6444Y

## Acknowledgements
Special thanks to all contributors and reviewers who have provided
valuable feedback and support for this patch series.

[1] https://learn.microsoft.com/en-us/windows/security/hardware-security/enable-virtualization-based-protection-of-code-integrity
[2] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/nested-virtualization#enlightened-vmcs-intel
[3] https://patchwork.kernel.org/project/kvm/patch/20231113022326.24388-6-mic@digikod.net/
[4] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/feature-discovery#implementation-recommendations---0x40000004
[5] https://github.com/JonKohler/qemu/tree/mbec-rfc-v1

Cc: Alexander Grest <Alexander.Grest@microsoft.com>
Cc: Nicolas Saenz Julienne <nsaenz@amazon.es>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Tao Su <tao1.su@linux.intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Zhao Liu <zhao1.liu@intel.com>

Jon Kohler (11):
  KVM: x86: Add module parameter for Intel MBEC
  KVM: x86: Add pt_guest_exec_control to kvm_vcpu_arch
  KVM: VMX: Wire up Intel MBEC enable/disable logic
  KVM: x86/mmu: Remove SPTE_PERM_MASK
  KVM: VMX: Extend EPT Violation protection bits
  KVM: x86/mmu: Introduce shadow_ux_mask
  KVM: x86/mmu: Adjust SPTE_MMIO_ALLOWED_MASK to understand MBEC
  KVM: x86/mmu: Extend make_spte to understand MBEC
  KVM: nVMX: Setup Intel MBEC in nested secondary controls
  KVM: VMX: Allow MBEC with EVMCS
  KVM: x86: Enable module parameter for MBEC

Mickaël Salaün (5):
  KVM: VMX: add cpu_has_vmx_mbec helper
  KVM: VMX: Define VMX_EPT_USER_EXECUTABLE_MASK
  KVM: x86/mmu: Extend access bitfield in kvm_mmu_page_role
  KVM: VMX: Enhance EPT violation handler for PROT_USER_EXEC
  KVM: x86/mmu: Extend is_executable_pte to understand MBEC

Nikolay Borisov (1):
  KVM: VMX: Remove EPT_VIOLATIONS_ACC_*_BIT defines

Sean Christopherson (1):
  KVM: nVMX: Decouple EPT RWX bits from EPT Violation protection bits

 arch/x86/include/asm/kvm_host.h | 13 +++++----
 arch/x86/include/asm/vmx.h      | 45 ++++++++++++++++++++---------
 arch/x86/kvm/mmu.h              |  3 +-
 arch/x86/kvm/mmu/mmu.c          | 13 +++++----
 arch/x86/kvm/mmu/mmutrace.h     | 23 ++++++++++-----
 arch/x86/kvm/mmu/paging_tmpl.h  | 19 +++++++++---
 arch/x86/kvm/mmu/spte.c         | 51 ++++++++++++++++++++++++++++-----
 arch/x86/kvm/mmu/spte.h         | 36 +++++++++++++++--------
 arch/x86/kvm/mmu/tdp_mmu.c      |  2 +-
 arch/x86/kvm/vmx/capabilities.h |  6 ++++
 arch/x86/kvm/vmx/hyperv.c       |  5 +++-
 arch/x86/kvm/vmx/hyperv_evmcs.h |  1 +
 arch/x86/kvm/vmx/nested.c       |  4 +++
 arch/x86/kvm/vmx/vmx.c          | 21 ++++++++++++--
 arch/x86/kvm/vmx/vmx.h          |  7 +++++
 arch/x86/kvm/x86.c              |  4 +++
 16 files changed, 192 insertions(+), 61 deletions(-)

-- 
2.43.0

Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Sean Christopherson 7 months, 1 week ago
On Thu, Mar 13, 2025, Jon Kohler wrote:
> ## Summary
> This series introduces support for Intel Mode-Based Execute Control
> (MBEC) to KVM and nested VMX virtualization, aiming to significantly
> reduce VMexits and improve performance for Windows guests running with
> Hypervisor-Protected Code Integrity (HVCI).

...

> ## Testing
> Initial testing has been done on 6.12-based code with:
>   Guests:
>     - Windows 11 24H2 26100.2894
>     - Windows Server 2025 24H2 26100.2894
>     - Windows Server 2022 21H2 20348.825
>   Processors:
>     - Intel Skylake 6154
>     - Intel Sapphire Rapids 6444Y

This series needs testcases, and lots of 'em.  A short list off the top of my head:

 - New KVM-Unit-Test (KUT) ept_access_xxx testcases to verify KVM does the right
   thing with respect to user and supervisor code fetches when MBEC is:

     1. Supported and Enabled
     2. Supported but Disabled
     3. Unsupported

 - KUT testcases to verify VMLAUNCH/VMRESUME consistency checks.

 - KUT testcases to verify KVM treats WRITABLE+USER_EXEC as an illegal combination,
   i.e. that MBEC doesn't affect the W=1,R=0 behavior.
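
The list above can be turned into an explicit expectation matrix before any KUT code is written. The following is a rough Python model, not kvm-unit-tests code: the state names, helper names, and the reduction of each access to a few booleans are all assumptions made for illustration.

```python
from itertools import product

# The three MBEC configurations named in the list above.
MBEC_STATES = ("enabled", "disabled", "unsupported")


def expected_fetch_ok(state: str, user_fetch: bool, x: bool, ux: bool) -> bool:
    """Expected result of a code fetch given the EPT X (bit 2) and UX (bit 10) bits."""
    if state == "enabled" and user_fetch:
        return ux
    # Disabled or unsupported: only X matters, for both fetch modes.
    return x


def is_ept_misconfig(readable: bool, writable: bool, user_exec: bool) -> bool:
    """W=1,R=0 stays illegal; user_exec is deliberately ignored."""
    return writable and not readable


def access_cases():
    """Yield (state, user_fetch, x, ux, expected_ok) for every combination."""
    for state, user_fetch, x, ux in product(MBEC_STATES, (False, True),
                                            (False, True), (False, True)):
        yield state, user_fetch, x, ux, expected_fetch_ok(state, user_fetch, x, ux)


for case in access_cases():
    print(case)
```

This enumerates 3 x 2 x 2 x 2 = 24 fetch cases. The misconfiguration helper encodes the last bullet: setting the user-execute bit must not make a write-only mapping legal.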

The access tests in particular absolutely need to be provided along with the next
version.  Unless I'm missing something, this RFC implementation is buggy throughout
due to tracking MBEC on a per-vCPU basis, and all of those bugs should be exposed
by even relatively basic testcases.
Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Jon Kohler 7 months, 1 week ago

> On May 12, 2025, at 5:46 PM, Sean Christopherson <seanjc@google.com> wrote:
> 
> The access tests in particular absolutely need to be provided along with the next
> version.  Unless I'm missing something, this RFC implementation is buggy throughout
> due to tracking MBEC on a per-vCPU basis, and all of those bugs should be exposed
> by even relative basic testcases.

Thanks for the review, Sean. I’ll work on rebasing my patches from 6.12 to latest
and incorporating the feedback across the board.

On the KUT side, good news is I already have most of that done-ish, so I’ll tune
them up when I get the next rev of the series, and send them both out together.

Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Adrian-Ken Rueegsegger 7 months, 4 weeks ago
Hi,

On 3/13/25 21:36, Jon Kohler wrote:

[snip]

> The semantics for EPT violation qualifications also change when MBEC
> is enabled, with bit 5 reflecting supervisor/kernel mode execute
> permissions and bit 6 reflecting user mode execute permissions.
> This ultimately serves to expose this feature to the L1 hypervisor,
> which consumes MBEC and informs the L2 partitions not to use the
> software MBEC by removing bit 14 in 0x40000004 EAX [4].

Should this say bit 13 of 0x40000004.EAX? According to the referenced 
docs [4]:

Bit 13: "Recommend using INT for MBEC system calls."

Bit 14: "Recommend a nested hypervisor using the enlightened VMCS 
interface. Also indicates that additional nested enlightenments may be 
available (see leaf 0x4000000A)."
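
For reference, the two bits quoted above can be told apart with a simple mask test. This is an illustrative Python sketch; the constant names are made up here and are not identifiers from the TLFS, KVM, or QEMU.

```python
# Bit positions in CPUID 0x40000004 EAX as quoted from the docs above.
# The constant names are hypothetical.
RECOMMEND_INT_FOR_MBEC_SYSCALLS = 1 << 13
RECOMMEND_NESTED_EVMCS = 1 << 14


def recommendations(eax: int) -> list:
    """List which of the two quoted recommendations are set in EAX."""
    found = []
    if eax & RECOMMEND_INT_FOR_MBEC_SYSCALLS:
        found.append("use INT for MBEC system calls")
    if eax & RECOMMEND_NESTED_EVMCS:
        found.append("use enlightened VMCS for nested hypervisor")
    return found


print(recommendations(1 << 13))
print(recommendations(1 << 14))
print(recommendations((1 << 13) | (1 << 14)))
```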

Regards,
Adrian
Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Jon Kohler 7 months, 1 week ago

> On Apr 23, 2025, at 9:54 AM, Adrian-Ken Rueegsegger <ken@codelabs.ch> wrote:
> 
> Should this say bit 13 of 0x40000004.EAX? According to the referenced docs [4]:
> 
> Bit 13: "Recommend using INT for MBEC system calls."
> 
> Bit 14: "Recommend a nested hypervisor using the enlightened VMCS interface. Also indicates that additional nested enlightenments may be available (see leaf 0x4000000A)."
> 
> Regards,
> Adrian

Yes, you are correct. I’ll fix it on the next go-around; thanks for
pointing that out.
Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Mickaël Salaün 8 months ago
Hi,

This series looks good, just some inlined questions.

Sean, Paolo, what do you think?

Jon, what is the status of the QEMU patches?

Regards,
 Mickaël

Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Jon Kohler 8 months ago

> On Apr 15, 2025, at 5:29 AM, Mickaël Salaün <mic@digikod.net> wrote:
> 
> Hi,
> 
> This series looks good, just some inlined questions.

RE Inlined questions - Did you send those elsewhere? I didn’t
see any others in my inbox, nor on lore.

> Sean, Paolo, what do you think?
> 
> Jon, what is the status of the QEMU patches?

I was waiting for comments here before sending to mailing list, but
I did post a link to the tree in the cover letter. The actual commit itself
is wicked trivial, so knock on wood, I’d imagine that would be the easiest
part of this endeavor.

Would you suggest I send those to the QEMU mailing list now, while the kernel side
is still in RFC? Happy to do so if that makes sense.

https://github.com/JonKohler/qemu/commit/7a245414a0138b83cabcb809f5585ef8b5f78553

Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Mickaël Salaün 8 months ago
On Tue, Apr 15, 2025 at 02:43:57PM +0000, Jon Kohler wrote:
> 
> 
> > On Apr 15, 2025, at 5:29 AM, Mickaël Salaün <mic@digikod.net> wrote:
> > 
> > Hi,
> > 
> > This series looks good, just some inlined questions.
> 
> RE Inlined questions - Did you send those elsewhere? I didn’t
> see any others in my inbox, nor on lore.

No, I just wanted to highlight that you inserted questions in several
patches. :)

> 
> > Sean, Paolo, what do you think?
> > 
> > Jon, what is the status of the QEMU patches?
> 
> I was waiting for comments here before sending to mailing list, but
> I did post a link to the tree in the cover letter. The actual commit itself
> is wicked trivial, so knock on wood, I’d imagine that would be the easiest
> part of this endeavor.
> 
> Would you suggest I sent those to QEMU mailing list now, while kernel side
> is still in RFC? Happy to do so if that makes sense.
> 
> https://github.com/JonKohler/qemu/commit/7a245414a0138b83cabcb809f5585ef8b5f78553

You can wait until Sean gets a look at this series, but you don't need
to wait for it to be merged before starting a discussion with QEMU
developers.
Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Sean Christopherson 8 months ago
On Tue, Apr 15, 2025, Mickaël Salaün wrote:
> Hi,
> 
> This series looks good, just some inlined questions.
> 
> Sean, Paolo, what do you think?

It's high up on my todo, but I've been swamped with non-upstream stuff for the
last few weeks (and I'm not quite out of the woods), so I might not get to it
this week.
Re: [RFC PATCH 00/18] KVM: VMX: Introduce Intel Mode-Based Execute Control (MBEC)
Posted by Jon Kohler 7 months, 1 week ago

> On Apr 15, 2025, at 10:43 AM, Sean Christopherson <seanjc@google.com> wrote:
> 
> On Tue, Apr 15, 2025, Mickaël Salaün wrote:
>> Hi,
>> 
>> This series looks good, just some inlined questions.
>> 
>> Sean, Paolo, what do you think?
> 
> It's high up on my todo, but I've been swamped with non-upstream stuff for the
> last few weeks (and I'm not quite out of the woods), so I might not get to it
> this week.

Gentle ping on this series. I know you’ve been swamped; any
line of sight on getting out of the woods? I just got back from travel
myself, so I’m catching up on todos.