[PATCH v8 00/10] VMSCAPE optimization for BHI variant

Pawan Gupta posted 10 patches 1 week, 2 days ago
v8:
- Use helper in KVM to convey the mitigation status. (PeterZ/Borisov)
- Fix the documentation for default vmscape mitigation. (BPF bot)
- Remove the stray lines in bugs.c (BPF bot).
- Updated commit messages and comments.
- Rebased to v7.0-rc5.

v7: https://lore.kernel.org/r/20260319-vmscape-bhb-v7-0-b76a777a98af@linux.intel.com
- s/This allows/Allow/ and s/This does adds/This adds/ in patch 1/10 commit
  message (Borislav).
- Minimize register usage in BHB clearing seq. (David Laight)
  - Instead of separate ecx/eax counters, use al/ah.
  - Adjust the alignment of RET due to register size change.
  - save/restore rax in the seq itself.
  - Remove the save/restore of rax/rcx for BPF callers.
- Rename clear_bhb_loop() to clear_bhb_loop_nofence() to make it
  obvious that the LFENCE is not part of the sequence (Borislav).
- Fix Kconfig: s/select/depends on/ HAVE_STATIC_CALL (PeterZ).
- Rebased to v7.0-rc4.
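
For readers unfamiliar with the al/ah change in v7, here is a rough C
model of a nested loop driven by two byte-sized counters that share one
register. It is only a sketch: the struct, the function name, which byte
holds which counter, and the loop counts passed in are assumptions made
for illustration, not the actual assembly sequence.

```c
#include <stdint.h>

/* Model of a 16-bit register whose low and high bytes act as AL and AH. */
struct ax_reg {
	uint8_t al;
	uint8_t ah;
};

/*
 * Run `outer` rounds of `inner` iterations using only the two byte
 * counters, and return how many inner iterations executed in total.
 * Which byte holds which counter is an assumption of this sketch.
 */
static unsigned int nested_byte_loops(uint8_t outer, uint8_t inner)
{
	struct ax_reg ax = { .al = 0, .ah = outer };
	unsigned int iterations = 0;

	while (ax.ah != 0) {
		ax.al = inner;		/* reload the inner counter each round */
		while (ax.al != 0) {
			iterations++;
			ax.al--;
		}
		ax.ah--;
	}
	return iterations;
}
```

Since both counters fit in one register, only that register needs to be
saved and restored around the sequence, which is the point of the v7 item.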

v6: https://lore.kernel.org/r/20251201-vmscape-bhb-v6-0-d610dd515714@linux.intel.com
- Remove semicolon at the end of asm in ALTERNATIVE (Uros).
- Fix build warning in vmscape_select_mitigation() (LKP).
- Rebased to v6.18.

v5: https://lore.kernel.org/r/20251126-vmscape-bhb-v5-2-02d66e423b00@linux.intel.com
- For BHI seq, limit runtime-patching to loop counts only (Dave).
  Dropped 2 patches that moved the BHB seq to a macro.
- Remove redundant switch cases in vmscape_select_mitigation() (Nikolay).
- Improve commit message (Nikolay).
- Collected tags.

v4: https://lore.kernel.org/r/20251119-vmscape-bhb-v4-0-1adad4e69ddc@linux.intel.com
- Move LFENCE to the callsite, out of clear_bhb_loop(). (Dave)
- Make clear_bhb_loop() work for larger BHB. (Dave)
  This now uses hardware enumeration to determine the BHB size to clear.
- Use write_ibpb() instead of indirect_branch_prediction_barrier() when
  IBPB is known to be available. (Dave)
- Use static_call() to simplify mitigation at exit-to-userspace. (Dave)
- Refactor vmscape_select_mitigation(). (Dave)
- Fix vmscape=on which was wrongly behaving as AUTO. (Dave)
- Split the patches. (Dave)
  - Patch 1-4 prepares for making the sequence flexible for VMSCAPE use.
  - Patch 5 trivial rename of variable.
  - Patch 6-8 prepares for deploying BHB mitigation for VMSCAPE.
  - Patch 9 deploys the mitigation.
  - Patch 10-11 fixes ON Vs AUTO mode.
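
As a rough userspace illustration of the static_call() item above, the
sketch below binds one flush routine at boot and invokes it at
exit-to-userspace only when a flush is pending. A plain function pointer
stands in for static_call(), a global bool stands in for the
x86_predictor_flush_exit_to_user flag, and all function names are
hypothetical. How the flag is set and cleared here is an assumption of
the sketch, not a description of the patches.

```c
#include <stdbool.h>

/* Counters so the sketch can be observed; real flushes return nothing. */
static int ibpb_calls;
static int bhb_clear_calls;

static void flush_ibpb(void)      { ibpb_calls++; }
static void flush_bhb_clear(void) { bhb_clear_calls++; }

/* Stand-in for a static_call target: bound once, NULL means no mitigation. */
static void (*predictor_flush)(void);

/* Stand-in for the per-CPU x86_predictor_flush_exit_to_user flag. */
static bool flush_pending;

/* Boot-time selection binds the flush routine once. */
static void bind_predictor_flush(void (*fn)(void))
{
	predictor_flush = fn;
}

/* A guest ran on this CPU, so a flush is owed before returning to user. */
static void note_vm_exit(void)
{
	flush_pending = true;
}

/* Exit-to-userspace path: flush at most once per pending request. */
static void exit_to_user(void)
{
	if (predictor_flush && flush_pending) {
		predictor_flush();
		flush_pending = false;
	}
}
```

With a single bound target, the exit path does not need to know which
mitigation was chosen; it only checks whether a flush is owed.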

v3: https://lore.kernel.org/r/20251027-vmscape-bhb-v3-0-5793c2534e93@linux.intel.com
- s/x86_pred_flush_pending/x86_predictor_flush_exit_to_user/ (Sean).
- Removed IBPB & BHB-clear mutual exclusion at exit-to-userspace.
- Collected tags.

v2: https://lore.kernel.org/r/20251015-vmscape-bhb-v2-0-91cbdd9c3a96@linux.intel.com
- Added check for IBPB feature in vmscape_select_mitigation(). (David)
- s/vmscape=auto/vmscape=on/ (David)
- Added patch to remove LFENCE from VMSCAPE BHB-clear sequence.
- Rebased to v6.18-rc1.

v1: https://lore.kernel.org/r/20250924-vmscape-bhb-v1-0-da51f0e1934d@linux.intel.com

Hi All,

These patches aim to improve the performance of a recent mitigation for the
VMSCAPE[1] vulnerability. The improvement is relevant for the BHI variant of
VMSCAPE, which affects Alder Lake and newer processors.

The current mitigation uses IBPB on kvm-exit-to-userspace across the entire
range of affected CPUs. This is overkill for CPUs that are only affected
by the BHI variant. On such CPUs, clearing the branch history is sufficient
for VMSCAPE, and also more apt, as the underlying issue is poisoned
branch history.
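
The choice described above can be sketched in plain C as follows. This is
a minimal model, not the code in bugs.c: the struct, its fields, the enum,
and select_flush() are all hypothetical names, and the real selection also
has to honor the vmscape= command line and attack-vector controls.

```c
#include <stdbool.h>

/* Hypothetical model of the CPU properties that matter for the choice. */
struct cpu_model {
	bool affected_by_vmscape;	/* some VMSCAPE variant applies */
	bool bhi_variant_only;		/* only the BHI variant applies */
	bool has_ibpb;			/* IBPB is available */
};

enum flush_kind { FLUSH_NONE, FLUSH_BHB_CLEAR, FLUSH_IBPB };

/* Pick the cheapest flush that still covers the CPU's exposure. */
static enum flush_kind select_flush(const struct cpu_model *cpu)
{
	if (!cpu->affected_by_vmscape)
		return FLUSH_NONE;
	if (cpu->bhi_variant_only)
		return FLUSH_BHB_CLEAR;	/* clearing branch history suffices */
	return cpu->has_ibpb ? FLUSH_IBPB : FLUSH_NONE;
}
```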

Below is the iPerf data for transfers between guest and host, comparing the
IBPB and BHB-clear mitigations. BHB-clear shows a performance improvement
over IBPB in most cases.

Platform: Emerald Rapids
Baseline: vmscape=off
Target: IBPB at VMexit-to-userspace vs. the new BHB-clear at
	VMexit-to-userspace mitigation (both compared against baseline).

(pN = N parallel connections)

| iPerf user-net | IBPB    | BHB Clear |
|----------------|---------|-----------|
| UDP 1-vCPU_p1  | -12.5%  |   1.3%    |
| TCP 1-vCPU_p1  | -10.4%  |  -1.5%    |
| TCP 1-vCPU_p1  | -7.5%   |  -3.0%    |
| UDP 4-vCPU_p16 | -3.7%   |  -3.7%    |
| TCP 4-vCPU_p4  | -2.9%   |  -1.4%    |
| UDP 4-vCPU_p4  | -0.6%   |   0.0%    |
| TCP 4-vCPU_p4  |  3.5%   |   0.0%    |

| iPerf bridge-net | IBPB    | BHB Clear |
|------------------|---------|-----------|
| UDP 1-vCPU_p1    | -9.4%   |  -0.4%    |
| TCP 1-vCPU_p1    | -3.9%   |  -0.5%    |
| UDP 4-vCPU_p16   | -2.2%   |  -3.8%    |
| TCP 4-vCPU_p4    | -1.0%   |  -1.0%    |
| TCP 4-vCPU_p4    |  0.5%   |   0.5%    |
| UDP 4-vCPU_p4    |  0.0%   |   0.9%    |
| TCP 1-vCPU_p1    |  0.0%   |   0.9%    |

| iPerf vhost-net | IBPB    | BHB Clear |
|-----------------|---------|-----------|
| UDP 1-vCPU_p1   | -4.3%   |   1.0%    |
| TCP 1-vCPU_p1   | -3.8%   |  -0.5%    |
| TCP 1-vCPU_p1   | -2.7%   |  -0.7%    |
| UDP 4-vCPU_p16  | -0.7%   |  -2.2%    |
| TCP 4-vCPU_p4   | -0.4%   |   0.8%    |
| UDP 4-vCPU_p4   |  0.4%   |  -0.7%    |
| TCP 4-vCPU_p4   |  0.0%   |   0.6%    |
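
For reading the tables above: each cell is presumably the relative change
of the mitigated run against the vmscape=off baseline. The exact formula
is not stated in this mail, so the helper below is an assumption, and the
throughput values in the note after it are made up for illustration.

```c
/* Percent change of a mitigated run relative to the vmscape=off baseline. */
static double percent_change(double baseline, double measured)
{
	return (measured - baseline) / baseline * 100.0;
}
```

For instance, a hypothetical baseline of 8.0 Gbit/s and a mitigated run of
7.0 Gbit/s give percent_change(8.0, 7.0) == -12.5, i.e. a 12.5% loss; a
positive value would mean the mitigated run was faster than baseline.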

[1] https://comsec.ethz.ch/research/microarch/vmscape-exposing-and-exploiting-incomplete-branch-predictor-isolation-in-cloud-environments/

---
Pawan Gupta (10):
      x86/bhi: x86/vmscape: Move LFENCE out of clear_bhb_loop()
      x86/bhi: Make clear_bhb_loop() effective on newer CPUs
      x86/bhi: Rename clear_bhb_loop() to clear_bhb_loop_nofence()
      x86/vmscape: Rename x86_ibpb_exit_to_user to x86_predictor_flush_exit_to_user
      x86/vmscape: Move mitigation selection to a switch()
      x86/vmscape: Use write_ibpb() instead of indirect_branch_prediction_barrier()
      x86/vmscape: Use static_call() for predictor flush
      x86/vmscape: Deploy BHB clearing mitigation
      x86/vmscape: Resolve conflict between attack-vectors and vmscape=force
      x86/vmscape: Add cmdline vmscape=on to override attack vector controls
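
To make the command-line handling in the last two patches easier to
picture, here is a small sketch that maps the vmscape= values mentioned in
this thread (off, on, force) to a mode. The enum names, the function name,
and the fallback to AUTO for a missing or unknown value are assumptions of
this sketch; any other values the kernel accepts are left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical modes; AUTO is what applies when nothing is requested. */
enum vmscape_cmd {
	VMSCAPE_CMD_AUTO,
	VMSCAPE_CMD_OFF,
	VMSCAPE_CMD_ON,
	VMSCAPE_CMD_FORCE,
};

/* Map the text after "vmscape=" to a mode; unknown input falls back to AUTO. */
static enum vmscape_cmd parse_vmscape(const char *arg)
{
	if (arg == NULL)
		return VMSCAPE_CMD_AUTO;
	if (strcmp(arg, "off") == 0)
		return VMSCAPE_CMD_OFF;
	if (strcmp(arg, "on") == 0)
		return VMSCAPE_CMD_ON;
	if (strcmp(arg, "force") == 0)
		return VMSCAPE_CMD_FORCE;
	return VMSCAPE_CMD_AUTO;
}
```

Keeping ON and FORCE as distinct modes is what lets a later selection step
treat them differently from AUTO.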

 Documentation/admin-guide/hw-vuln/vmscape.rst   | 15 ++++-
 Documentation/admin-guide/kernel-parameters.txt |  6 +-
 arch/x86/Kconfig                                |  1 +
 arch/x86/entry/entry_64.S                       | 34 +++++++----
 arch/x86/include/asm/cpufeatures.h              |  2 +-
 arch/x86/include/asm/entry-common.h             |  9 ++-
 arch/x86/include/asm/nospec-branch.h            | 13 +++--
 arch/x86/include/asm/processor.h                |  1 +
 arch/x86/kernel/cpu/bugs.c                      | 76 ++++++++++++++++++++-----
 arch/x86/kvm/x86.c                              |  4 +-
 arch/x86/net/bpf_jit_comp.c                     | 11 +---
 11 files changed, 127 insertions(+), 45 deletions(-)
---
base-commit: c369299895a591d96745d6492d4888259b004a9e
change-id: 20250916-vmscape-bhb-d7d469977f2f

Best regards,
--  
Thanks,
Pawan
Re: [PATCH v8 00/10] VMSCAPE optimization for BHI variant
Posted by Jon Kohler 3 days, 20 hours ago

> On Mar 24, 2026, at 2:16 PM, Pawan Gupta <pawan.kumar.gupta@linux.intel.com> wrote:
> 

Tested v7 of this series on 6.18.y with one of our performance
suites, where we had previously bisected a significant regression to
the enablement of the VMSCAPE mitigation. This particular suite measures
synthetic performance using KVM-virtualized Windows guests.

Long story short, this suite tries to derive what the end-user experience
would be in these virtual machines while performing a standardized set
of synthetic tasks on real apps.

VMSCAPE hits especially hard when enabling Windows HVCI, which drives
a much higher VMExit count, all else equal.

Tested on an Intel Xeon 6444Y (SPR)

TLDR, we're really happy with the results. The following was with 
Intel MBEC *enabled*, so even with that speedup (and drastic reduction
in VMExits), this optimization makes a significant difference.

- CPU-ready time drops ~70% across all steady-state and log-on metrics
  with this series, indicating more efficient context switching, even
  though overall hypervisor CPU rises ~14% (steady) to ~12% (max).
  Basically, we're getting more actual work done.
- Read and write IOPS increase by ~18–37% and 14–20% respectively, while
  average IO latency remains largely unchanged or slightly lower in
  steady metrics.
- Power consumption falls 5–11% in every category.
- Login times improve by 4–6% on average.
- Application start-up times are generally better (Word, Excel,
  PowerPoint, Outlook); in particular, Outlook max time drops 67%, a
  clear win for end-user experience.

Tested-by: Jon Kohler <jon@nutanix.com>

Re: [PATCH v8 00/10] VMSCAPE optimization for BHI variant
Posted by Pawan Gupta 3 days, 7 hours ago
On Mon, Mar 30, 2026 at 03:16:32AM +0000, Jon Kohler wrote:
> Tested the v7 of this series with 6.18.y and one of our performance
> suites, where we had previously bisected a significant regression to
> the enablement of the VMSCAPE mitigation. This particular suite looks
> at synthetic performance using KVM virtualized Windows guests.
> 
> Long story short, this suite tries to derive what end user experience
> would be in these virtual machines while performing a standardized set
> of synthetic tasks on real apps.
> 
> VMSCAPE hits especially hard when enabling Windows HVCI, which drives
> a much higher VMExit count, all else equal.
> 
> Tested on an Intel Xeon 6444Y (SPR)
> 
> TLDR, we're really happy with the results. The following was with 
> Intel MBEC *enabled*, so even with that speedup (and drastic reduction
> in VMExits), this optimization makes a significant difference.
> 
> - CPU-ready time drops ~70% across all steady-state and log-on metrics
>   with this series, indicating more efficient context switching, even
>   though overall hypervisor CPU rises ~14% (steady) to ~12% (max).
>   Basically, we're getting more actual work done.
> - Read and write IOPS increase by ~18–37% and 14–20% respectively, while
>   average IO latency remains largely unchanged or slightly lower in
>   steady metrics.
> - Power consumption falls 5–11% in every category.
> - Login times improve by 4–6% on average.
> - Application start-up times are generally better (Word, Excel,
>   PowerPoint, Outlook); in particular, Outlook max time drops 67%, a
>   clear win for end-user experience.

These results are promising.

> Tested-by: Jon Kohler <jon@nutanix.com>

Thanks for testing, Jon.