[PATCH v3 0/3] VMSCAPE optimization for BHI variant

Pawan Gupta posted 3 patches 3 months, 1 week ago
There is a newer version of this series
[PATCH v3 0/3] VMSCAPE optimization for BHI variant
Posted by Pawan Gupta 3 months, 1 week ago
v3:
- s/x86_pred_flush_pending/x86_predictor_flush_exit_to_user/ (Sean).
- Removed IBPB & BHB-clear mutual exclusion at exit-to-userspace.
- Collected tags.

v2: https://lore.kernel.org/r/20251015-vmscape-bhb-v2-0-91cbdd9c3a96@linux.intel.com
- Added check for IBPB feature in vmscape_select_mitigation(). (David)
- s/vmscape=auto/vmscape=on/ (David)
- Added patch to remove LFENCE from VMSCAPE BHB-clear sequence.
- Rebased to v6.18-rc1.

v1: https://lore.kernel.org/r/20250924-vmscape-bhb-v1-0-da51f0e1934d@linux.intel.com

Hi All,

These patches aim to improve the performance of a recent mitigation for the
VMSCAPE[1] vulnerability. The improvement is relevant to the BHI variant of
VMSCAPE, which affects Alder Lake and newer processors.

The current mitigation issues an IBPB on kvm-exit-to-userspace on all
affected CPUs. This is overkill for CPUs that are only affected by the BHI
variant. On such CPUs, clearing the branch history is sufficient for
VMSCAPE, and is also more apt since the underlying issue is poisoned
branch history.
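
As a rough illustration of the policy described above, the choice between
the two flushes could be modelled as below. This is only a sketch: every
name in it (the enum, the struct fields, sketch_select_mitigation()) is
hypothetical and not taken from the patches, and it deliberately ignores
how the exit-to-userspace path actually applies the flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mitigation modes; not the kernel's actual enum. */
enum sketch_vmscape_mitigation {
	SKETCH_VMSCAPE_NONE,
	SKETCH_VMSCAPE_IBPB_EXIT_TO_USER,
	SKETCH_VMSCAPE_BHB_CLEAR_EXIT_TO_USER,
};

/* Hypothetical capability summary for one CPU. */
struct sketch_cpu_caps {
	bool affected;	/* affected by some variant of VMSCAPE */
	bool bhi_only;	/* only the BHI variant applies */
	bool has_ibpb;	/* IBPB is available */
};

/*
 * Prefer the cheaper branch history clear when only the BHI variant
 * applies; otherwise fall back to IBPB if the CPU supports it.
 */
static enum sketch_vmscape_mitigation
sketch_select_mitigation(const struct sketch_cpu_caps *caps, bool enabled)
{
	assert(caps != NULL);

	if (!enabled || !caps->affected)
		return SKETCH_VMSCAPE_NONE;
	if (caps->bhi_only)
		return SKETCH_VMSCAPE_BHB_CLEAR_EXIT_TO_USER;
	if (caps->has_ibpb)
		return SKETCH_VMSCAPE_IBPB_EXIT_TO_USER;
	return SKETCH_VMSCAPE_NONE;
}
```

In this model, a CPU that is affected only by the BHI variant gets the
branch history clear even when IBPB is available, which is the trade-off
the series argues for.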

Roadmap:

- First patch introduces clear_bhb_long_loop() for processors with larger
  branch history tables.
- Second patch replaces IBPB on exit-to-userspace with a branch history
  clearing sequence.
- Third patch removes the LFENCE from the BHB-clearing long loop.

Below is iPerf data for transfers between guest and host, comparing the
IBPB and BHB-clear mitigations. BHB-clear shows a performance improvement
over IBPB in most cases.

Platform: Emerald Rapids
Baseline: vmscape=off

(pN = N parallel connections)

| iPerf user-net | IBPB    | BHB Clear |
|----------------|---------|-----------|
| UDP 1-vCPU_p1  | -12.5%  |   1.3%    |
| TCP 1-vCPU_p1  | -10.4%  |  -1.5%    |
| TCP 1-vCPU_p1  | -7.5%   |  -3.0%    |
| UDP 4-vCPU_p16 | -3.7%   |  -3.7%    |
| TCP 4-vCPU_p4  | -2.9%   |  -1.4%    |
| UDP 4-vCPU_p4  | -0.6%   |   0.0%    |
| TCP 4-vCPU_p4  |  3.5%   |   0.0%    |

| iPerf bridge-net | IBPB    | BHB Clear |
|------------------|---------|-----------|
| UDP 1-vCPU_p1    | -9.4%   |  -0.4%    |
| TCP 1-vCPU_p1    | -3.9%   |  -0.5%    |
| UDP 4-vCPU_p16   | -2.2%   |  -3.8%    |
| TCP 4-vCPU_p4    | -1.0%   |  -1.0%    |
| TCP 4-vCPU_p4    |  0.5%   |   0.5%    |
| UDP 4-vCPU_p4    |  0.0%   |   0.9%    |
| TCP 1-vCPU_p1    |  0.0%   |   0.9%    |

| iPerf vhost-net | IBPB    | BHB Clear |
|-----------------|---------|-----------|
| UDP 1-vCPU_p1   | -4.3%   |   1.0%    |
| TCP 1-vCPU_p1   | -3.8%   |  -0.5%    |
| TCP 1-vCPU_p1   | -2.7%   |  -0.7%    |
| UDP 4-vCPU_p16  | -0.7%   |  -2.2%    |
| TCP 4-vCPU_p4   | -0.4%   |   0.8%    |
| UDP 4-vCPU_p4   |  0.4%   |  -0.7%    |
| TCP 4-vCPU_p4   |  0.0%   |   0.6%    |

[1] https://comsec.ethz.ch/research/microarch/vmscape-exposing-and-exploiting-incomplete-branch-predictor-isolation-in-cloud-environments/

---
Pawan Gupta (3):
      x86/bhi: Add BHB clearing for CPUs with larger branch history
      x86/vmscape: Replace IBPB with branch history clear on exit to userspace
      x86/vmscape: Remove LFENCE from BHB clearing long loop

 Documentation/admin-guide/hw-vuln/vmscape.rst   |  8 ++++
 Documentation/admin-guide/kernel-parameters.txt |  4 +-
 arch/x86/entry/entry_64.S                       | 63 ++++++++++++++++++-------
 arch/x86/include/asm/cpufeatures.h              |  1 +
 arch/x86/include/asm/entry-common.h             | 12 +++--
 arch/x86/include/asm/nospec-branch.h            |  5 +-
 arch/x86/kernel/cpu/bugs.c                      | 53 +++++++++++++++------
 arch/x86/kvm/x86.c                              |  5 +-
 8 files changed, 110 insertions(+), 41 deletions(-)
---
base-commit: fd57572253bc356330dbe5b233c2e1d8426c66fd
change-id: 20250916-vmscape-bhb-d7d469977f2f

Best regards,
-- 
Pawan
Re: [PATCH v3 0/3] VMSCAPE optimization for BHI variant
Posted by Dave Hansen 3 months ago
On 10/27/25 16:43, Pawan Gupta wrote:
> | iPerf user-net | IBPB    | BHB Clear |
> |----------------|---------|-----------|
> | UDP 1-vCPU_p1  | -12.5%  |   1.3%    |
...

Could you clarify what "1.3%" means? Is that relative to the baseline,
or relative to the IBPB number?

If it's relative to the baseline, then this data either looks wrong or
noisy since there are a lot of places where adding the BHB Clear loop
makes things faster.
Re: [PATCH v3 0/3] VMSCAPE optimization for BHI variant
Posted by Pawan Gupta 3 months ago
On Mon, Nov 03, 2025 at 12:07:30PM -0800, Dave Hansen wrote:
> On 10/27/25 16:43, Pawan Gupta wrote:
> > | iPerf user-net | IBPB    | BHB Clear |
> > |----------------|---------|-----------|
> > | UDP 1-vCPU_p1  | -12.5%  |   1.3%    |
> ...
> 
> Could you clarify what "1.3%" means? Is that relative to the baseline,
> or relative to the IBPB number?

This is relative to the baseline. Sorry I didn't mention that explicitly.

> If it's relative to the baseline, then this data either looks wrong or
> noisy since there are a lot of places where adding the BHB Clear loop
> makes things faster.

I will double-check, but I am fairly positive that this wasn't noise.
Surprisingly, there were a few other cases where BHB clearing performed
better than the baseline.
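
For reference, a figure "relative to the baseline" is presumably the
percent change of a run against the vmscape=off run. The helper below is a
minimal sketch of that arithmetic; the function name is hypothetical and
the thread does not show how the posted numbers were actually computed.

```c
#include <assert.h>

/*
 * Percent change of a measured throughput relative to a baseline run.
 * Negative means slower than baseline, positive means faster.
 */
static double sketch_percent_vs_baseline(double measured, double baseline)
{
	assert(baseline > 0.0);
	return (measured - baseline) / baseline * 100.0;
}
```

With made-up inputs, a run measuring 87.5 units against a baseline of 100
units would be reported as -12.5%, and a run measuring more than the
baseline would be reported as a positive percentage.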