[v4] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups

[PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups

Posted by Sean Christopherson 3 months, 1 week ago

This is a combination of Brendan's work to unify the L1TF L1D flushing
mitigation, and Pawan's work to bring some sanity to the mitigations that
clear CPU buffers, with a bunch of glue code and some polishing from me.

The "v4" is relative to the L1TF series.  I smushed the two series together
as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
handling the series separately would have been a lot of work+churn for no
real benefit.

TL;DR:

 - Unify L1TF flushing under per-CPU variable
 - Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
 - Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter

To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
VERW, tweak the low-level macros in nospec-branch.h to define the instruction
sequence, and then wrap it with __stringify() as needed.
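
As a rough illustration of that split (a userspace sketch, not the code from
the series), the snippet below defines an instruction sequence once as bare
tokens and stringifies it on demand.  It also models the selection order I
understand a two-alternative macro to use: the second alternative wins if its
feature is set, otherwise the first, otherwise the default.  All DEMO_* and
demo_* names are hypothetical and exist only for this example; nothing here
patches real instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Two-level stringification so that macro arguments are expanded first. */
#define DEMO_STRINGIFY_1(...)	#__VA_ARGS__
#define DEMO_STRINGIFY(...)	DEMO_STRINGIFY_1(__VA_ARGS__)

/* Hypothetical raw instruction sequence, defined once as bare tokens. */
#define DEMO_VERW_SEQ verw demo_verw_sel(%rip)

/* String form, for callers that must pass the sequence as a string. */
static const char *demo_verw_string(void)
{
	return DEMO_STRINGIFY(DEMO_VERW_SEQ);
}

/*
 * Plain-C model of two-alternative selection (an assumption about the
 * priority order, not the kernel's patching code): alt2 if feat2 is set,
 * else alt1 if feat1 is set, else the default sequence.
 */
static const char *demo_alternative_2(const char *def,
				      const char *alt1, bool feat1,
				      const char *alt2, bool feat2)
{
	if (feat2)
		return alt2;
	if (feat1)
		return alt1;
	return def;
}
```

Keeping the sequence as tokens means assembly can use it directly, while C
callers that need a string wrap it in the stringify helper, so the sequence
is written down exactly once.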

The non-VMX code is lightly tested (but there's far less chance for breakage
there).  For the VMX code, I verified it does what I want (which may or may
not be correct :-D) by hacking the code to force/clear various mitigations, and
using ud2 to confirm the right path got selected.

v4:
 - Drop the patch to fall back to handling the MMIO mitigation if
   vmx_l1d_flush() doesn't flush, and instead use Pawan's approach of
   decoupling the two entirely.
 - Replace the static branch with X86_FEATURE_CLEAR_CPU_BUF_MMIO so that
   it can be referenced in ALTERNATIVE macros.
 - Decouple X86_FEATURE_CLEAR_CPU_BUF_VM from X86_FEATURE_CLEAR_CPU_BUF_MMIO
   (though they still interact and can both be set)

v3:
 - https://lore.kernel.org/all/20251016200417.97003-1-seanjc@google.com
 - [Pawan's series] https://lore.kernel.org/all/20251029-verw-vm-v1-0-babf9b961519@linux.intel.com
 - Put the "raw" variant in KVM, dress it up with KVM's "request" terminology,
   and add a comment explaining why _KVM_ knows its usage doesn't need to
   disable virtualization.
 - Add the prep patches.

v2:
 - https://lore.kernel.org/all/20251015-b4-l1tf-percpu-v2-1-6d7a8d3d40e9@google.com
 - Moved the bit back to irq_stat
 - Fixed DEBUG_PREEMPT issues by adding a _raw variant

v1: https://lore.kernel.org/r/20251013-b4-l1tf-percpu-v1-1-d65c5366ea1a@google.com

Brendan Jackman (1):
  KVM: x86: Unify L1TF flushing under per-CPU variable

Pawan Gupta (1):
  x86/bugs: Use VM_CLEAR_CPU_BUFFERS in VMX as well

Sean Christopherson (6):
  x86/bugs: Decouple ALTERNATIVE usage from VERW macro definition
  x86/bugs: Use an X86_FEATURE_xxx flag for the MMIO Stale Data
    mitigation
  KVM: VMX: Handle MMIO Stale Data in VM-Enter assembly via
    ALTERNATIVES_2
  x86/bugs: KVM: Move VM_CLEAR_CPU_BUFFERS into SVM as
    SVM_CLEAR_CPU_BUFFERS
  KVM: VMX: Bundle all L1 data cache flush mitigation code together
  KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=n

 arch/x86/include/asm/cpufeatures.h   |   1 +
 arch/x86/include/asm/hardirq.h       |   4 +-
 arch/x86/include/asm/kvm_host.h      |   3 -
 arch/x86/include/asm/nospec-branch.h |  24 +--
 arch/x86/kernel/cpu/bugs.c           |  18 +-
 arch/x86/kvm/mmu/mmu.c               |   2 +-
 arch/x86/kvm/mmu/spte.c              |   2 +-
 arch/x86/kvm/svm/vmenter.S           |   6 +-
 arch/x86/kvm/vmx/nested.c            |   2 +-
 arch/x86/kvm/vmx/vmenter.S           |  14 +-
 arch/x86/kvm/vmx/vmx.c               | 235 ++++++++++++++-------------
 arch/x86/kvm/x86.c                   |   6 +-
 arch/x86/kvm/x86.h                   |  14 ++
 13 files changed, 178 insertions(+), 153 deletions(-)


base-commit: 4cc167c50eb19d44ac7e204938724e685e3d8057
-- 
2.51.1.930.gacf6e81ea2-goog

Re: [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups

Posted by Brendan Jackman 3 months, 1 week ago

On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> This is a combination of Brendan's work to unify the L1TF L1D flushing
> mitigation, and Pawan's work to bring some sanity to the mitigations that
> clear CPU buffers, with a bunch of glue code and some polishing from me.
>
> The "v4" is relative to the L1TF series.  I smushed the two series together
> as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
> for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
> handling the series separately would have been a lot of work+churn for no
> real benefit.
>
> TL;DR:
>
>  - Unify L1TF flushing under per-CPU variable
>  - Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
>  - Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter
>
> To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
> VERW, tweak the low-level macros in nospec-branch.h to define the instruction
> sequence, and then wrap it with __stringify() as needed.
>
> The non-VMX code is lightly tested (but there's far less chance for breakage
> there).  For the VMX code, I verified it does what I want (which may or may
> not be correct :-D) by hacking the code to force/clear various mitigations, and
> using ud2 to confirm the right path got selected.

FWIW [0] offers a way to check end-to-end that an L1TF exploit is broken
by the mitigation. It's a bit of a long-winded way to achieve that and I
guess L1TF is anyway the easy case here, but I couldn't resist promoting
it.

(I just received a Skylake machine from eBay; once that's set up I'll be
able to double check on there that things still work).

Re: [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups

Posted by Sean Christopherson 3 months, 1 week ago

On Fri, Oct 31, 2025, Brendan Jackman wrote:
> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
> > This is a combination of Brendan's work to unify the L1TF L1D flushing
> > mitigation, and Pawan's work to bring some sanity to the mitigations that
> > clear CPU buffers, with a bunch of glue code and some polishing from me.
> >
> > The "v4" is relative to the L1TF series.  I smushed the two series together
> > as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
> > for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
> > handling the series separately would have been a lot of work+churn for no
> > real benefit.
> >
> > TL;DR:
> >
> >  - Unify L1TF flushing under per-CPU variable
> >  - Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
> >  - Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter
> >
> > To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
> > VERW, tweak the low-level macros in nospec-branch.h to define the instruction
> > sequence, and then wrap it with __stringify() as needed.
> >
> > The non-VMX code is lightly tested (but there's far less chance for breakage
> > there).  For the VMX code, I verified it does what I want (which may or may
> > not be correct :-D) by hacking the code to force/clear various mitigations, and
> > using ud2 to confirm the right path got selected.
> 
> FWIW [0] offers a way to check end-to-end that an L1TF exploit is broken
> by the mitigation. It's a bit of a long-winded way to achieve that and I
> guess L1TF is anyway the easy case here, but I couldn't resist promoting
> it.

Yeah, it's on my radar, but it'll be a while before I have the bandwidth to dig
through something that involved (though I _am_ excited to have a way to actually
test mitigations).

Re: [PATCH v4 0/8] x86/bugs: KVM: L1TF and MMIO Stale Data cleanups

Posted by Brendan Jackman 3 months ago

On Fri Oct 31, 2025 at 5:36 PM UTC, Sean Christopherson wrote:
> On Fri, Oct 31, 2025, Brendan Jackman wrote:
>> On Fri Oct 31, 2025 at 12:30 AM UTC, Sean Christopherson wrote:
>> > This is a combination of Brendan's work to unify the L1TF L1D flushing
>> > mitigation, and Pawan's work to bring some sanity to the mitigations that
>> > clear CPU buffers, with a bunch of glue code and some polishing from me.
>> >
>> > The "v4" is relative to the L1TF series.  I smushed the two series together
>> > as Pawan's idea to clear CPU buffers for MMIO in vmenter.S obviated the need
>> > for a separate cleanup/fix to have vmx_l1d_flush() return true/false, and
>> > handling the series separately would have been a lot of work+churn for no
>> > real benefit.
>> >
>> > TL;DR:
>> >
>> >  - Unify L1TF flushing under per-CPU variable
>> >  - Bury L1TF L1D flushing under CONFIG_CPU_MITIGATIONS=y
>> >  - Move MMIO Stale Data into asm, and do VERW at most once per VM-Enter
>> >
>> > To allow VMX to use ALTERNATIVE_2 to select slightly different flows for doing
>> > VERW, tweak the low-level macros in nospec-branch.h to define the instruction
>> > sequence, and then wrap it with __stringify() as needed.
>> >
>> > The non-VMX code is lightly tested (but there's far less chance for breakage
>> > there).  For the VMX code, I verified it does what I want (which may or may
>> > not be correct :-D) by hacking the code to force/clear various mitigations, and
>> > using ud2 to confirm the right path got selected.
>> 
>> FWIW [0] offers a way to check end-to-end that an L1TF exploit is broken
>> by the mitigation. It's a bit of a long-winded way to achieve that and I
>> guess L1TF is anyway the easy case here, but I couldn't resist promoting
>> it.

Oops, for posterity, the missing [0] was:

[0]: https://lore.kernel.org/all/20251013-l1tf-test-v1-0-583fb664836d@google.com/

> Yeah, it's on my radar, but it'll be a while before I have the bandwidth to dig
> through something that involved (though I _am_ excited to have a way to actually
> test mitigations).

Also, I just realised I never mentioned this anywhere: this is just the first
part. We also have an extension to the L1TF exploit to make it attack
via SMT, and we have tests that exploit SRSO. Those will come later
though; I think there's no point in burning everyone out trying to get
everything in at once.