[PATCH 0/5] KVM: VMX: Fix MMIO Stale Data Mitigation
Posted by Sean Christopherson 4 months, 4 weeks ago
Fix KVM's mitigation of the MMIO Stale Data bug, as the current approach
doesn't actually detect whether or not a guest has access to MMIO.  E.g.
KVM_DEV_VFIO_FILE_ADD is entirely optional, and obviously only covers VFIO
devices, and so is a terrible heuristic for "can this vCPU access MMIO?"

To fix the flaw (hopefully), track whether or not a vCPU has access to MMIO
based on the MMU it will run with.  KVM already detects host MMIO when
installing PTEs in order to force host MMIO to UC (EPT bypasses MTRRs), so
feeding that information into the MMU is rather straightforward.

Note, I haven't actually verified this mitigates the MMIO Stale Data bug, but
I think it's safe to say no one has verified the existing code works either.

All that said, and despite what the subject says, my real interest in this
series is to kill off kvm_arch_{start,end}_assignment().  I.e. precisely
identifying MMIO is a means to an end.  Because as evidenced by the MMIO mess
and other bugs (e.g. vDPA device not getting device posted interrupts),
keying off KVM_DEV_VFIO_FILE_ADD for anything is a bad idea.

The last two patches of this series depend on the stupidly large device
posted interrupts rework:

  https://lore.kernel.org/all/20250523010004.3240643-1-seanjc@google.com

which in turn depends on a not-tiny prep series:

  https://lore.kernel.org/all/20250519232808.2745331-1-seanjc@google.com

Unless you care deeply about those patches, I honestly recommend just ignoring
them.  I posted them as part of this series, because posting two patches that
depend on *four* series seemed even more ridiculous :-)

Side topic: Pawan, I haven't forgotten about your mmio_stale_data_clear =>
cpu_buf_vm_clear rename, I promise I'll review it soon.

Sean Christopherson (5):
  KVM: x86: Avoid calling kvm_is_mmio_pfn() when kvm_x86_ops.get_mt_mask
    is NULL
  KVM: x86/mmu: Locally cache whether a PFN is host MMIO when making a
    SPTE
  KVM: VMX: Apply MMIO Stale Data mitigation if KVM maps MMIO into the
    guest
  Revert "kvm: detect assigned device via irqbypass manager"
  VFIO: KVM: x86: Drop kvm_arch_{start,end}_assignment()

 arch/x86/include/asm/kvm_host.h |  3 +--
 arch/x86/kvm/irq.c              |  9 +------
 arch/x86/kvm/mmu/mmu_internal.h |  3 +++
 arch/x86/kvm/mmu/spte.c         | 43 ++++++++++++++++++++++++++++++---
 arch/x86/kvm/mmu/spte.h         | 10 ++++++++
 arch/x86/kvm/vmx/run_flags.h    | 10 +++++---
 arch/x86/kvm/vmx/vmx.c          |  8 +++++-
 arch/x86/kvm/x86.c              | 18 --------------
 include/linux/kvm_host.h        | 18 --------------
 virt/kvm/vfio.c                 |  3 ---
 10 files changed, 68 insertions(+), 57 deletions(-)


base-commit: 1f0486097459e53d292db749de70e587339267f5
-- 
2.49.0.1151.ga128411c76-goog
Re: [PATCH 0/5] KVM: VMX: Fix MMIO Stale Data Mitigation
Posted by Sean Christopherson 3 months, 3 weeks ago
On Thu, 22 May 2025 18:17:51 -0700, Sean Christopherson wrote:
> Fix KVM's mitigation of the MMIO Stale Data bug, as the current approach
> doesn't actually detect whether or not a guest has access to MMIO.  E.g.
> KVM_DEV_VFIO_FILE_ADD is entirely optional, and obviously only covers VFIO
> devices, and so is a terrible heuristic for "can this vCPU access MMIO?"
> 
> To fix the flaw (hopefully), track whether or not a vCPU has access to MMIO
> based on the MMU it will run with.  KVM already detects host MMIO when
> installing PTEs in order to force host MMIO to UC (EPT bypasses MTRRs), so
> feeding that information into the MMU is rather straightforward.
> 
> [...]

Applied 1-3 to 'kvm-x86 mmio', and 4-5 to 'kvm-x86 no_assignment' (which is based
on 'irqs' and includes 'mmio' via a merge, to avoid having the mmio changes
depend on the IRQ overhaul).

[1/5] KVM: x86: Avoid calling kvm_is_mmio_pfn() when kvm_x86_ops.get_mt_mask is NULL
      https://github.com/kvm-x86/linux/commit/c126b46e6fa8
[2/5] KVM: x86/mmu: Locally cache whether a PFN is host MMIO when making a SPTE
      https://github.com/kvm-x86/linux/commit/ffe9d7966d01
[3/5] KVM: VMX: Apply MMIO Stale Data mitigation if KVM maps MMIO into the guest
      https://github.com/kvm-x86/linux/commit/83ebe7157483
[4/5] Revert "kvm: detect assigned device via irqbypass manager"
      https://github.com/kvm-x86/linux/commit/ff845e6a84c8
[5/5] VFIO: KVM: x86: Drop kvm_arch_{start,end}_assignment()
      https://github.com/kvm-x86/linux/commit/bbc13ae593e0

--
https://github.com/kvm-x86/kvm-unit-tests/tree/next
Re: [PATCH 0/5] KVM: VMX: Fix MMIO Stale Data Mitigation
Posted by Pawan Gupta 4 months, 3 weeks ago
On Thu, May 22, 2025 at 06:17:51PM -0700, Sean Christopherson wrote:
> Fix KVM's mitigation of the MMIO Stale Data bug, as the current approach
> doesn't actually detect whether or not a guest has access to MMIO.  E.g.
> KVM_DEV_VFIO_FILE_ADD is entirely optional, and obviously only covers VFIO

I believe this needs userspace co-operation?

> devices, and so is a terrible heuristic for "can this vCPU access MMIO?"
> 
> To fix the flaw (hopefully), track whether or not a vCPU has access to MMIO
> based on the MMU it will run with.  KVM already detects host MMIO when
> installing PTEs in order to force host MMIO to UC (EPT bypasses MTRRs), so
> feeding that information into the MMU is rather straightforward.
> 
> Note, I haven't actually verified this mitigates the MMIO Stale Data bug, but
> I think it's safe to say no one has verified the existing code works either.

Mitigation was verified for VFIO devices, but of course not for the cases you
mentioned above. Typically, it is the PCI config registers on some faulty
devices (that don't respect byte-enable) that are subject to MMIO Stale Data.
But it is impossible to test and confirm with absolute certainty that all
other cases are not affected. Your patches should rule out those cases as
well.

Regarding validating this: if VERW is executed at VM-entry, the mitigation was
found to be effective. This is similar to other bugs like MDS. I am not a
virtualization expert, but I will try to validate whatever I can.

> All that said, and despite what the subject says, my real interest in this
> series is to kill off kvm_arch_{start,end}_assignment().  I.e. precisely
> identifying MMIO is a means to an end.  Because as evidenced by the MMIO mess
> and other bugs (e.g. vDPA device not getting device posted interrupts),
> keying off KVM_DEV_VFIO_FILE_ADD for anything is a bad idea.
> 
> The last two patches of this series depend on the stupidly large device
> posted interrupts rework:
> 
>   https://lore.kernel.org/all/20250523010004.3240643-1-seanjc@google.com
> 
> which in turn depends on a not-tiny prep series:
> 
>   https://lore.kernel.org/all/20250519232808.2745331-1-seanjc@google.com
> 
> Unless you care deeply about those patches, I honestly recommend just ignoring
> them.  I posted them as part of this series, because posting two patches that
> depend on *four* series seemed even more ridiculous :-)
> 
> Side topic: Pawan, I haven't forgotten about your mmio_stale_data_clear =>
> cpu_buf_vm_clear rename, I promise I'll review it soon.

No problem.
Re: [PATCH 0/5] KVM: VMX: Fix MMIO Stale Data Mitigation
Posted by Sean Christopherson 4 months, 2 weeks ago
On Wed, May 28, 2025, Pawan Gupta wrote:
> On Thu, May 22, 2025 at 06:17:51PM -0700, Sean Christopherson wrote:
> > Fix KVM's mitigation of the MMIO Stale Data bug, as the current approach
> > doesn't actually detect whether or not a guest has access to MMIO.  E.g.
> > KVM_DEV_VFIO_FILE_ADD is entirely optional, and obviously only covers VFIO
> 
> I believe this needs userspace co-operation?

Yes, more or less.  If the userspace VMM knows it doesn't need to trigger the
side effects of KVM_DEV_VFIO_FILE_ADD (e.g. isn't dealing with non-coherent DMA),
and doesn't need the VFIO<=>KVM binding (e.g. for KVM-GT), then AFAIK it's safe
to skip KVM_DEV_VFIO_FILE_ADD, modulo this mitigation.

> > devices, and so is a terrible heuristic for "can this vCPU access MMIO?"
> > 
> > To fix the flaw (hopefully), track whether or not a vCPU has access to MMIO
> > based on the MMU it will run with.  KVM already detects host MMIO when
> > installing PTEs in order to force host MMIO to UC (EPT bypasses MTRRs), so
> > feeding that information into the MMU is rather straightforward.
> > 
> > Note, I haven't actually verified this mitigates the MMIO Stale Data bug, but
> > I think it's safe to say no one has verified the existing code works either.
> 
> Mitigation was verified for VFIO devices, but of course not for the cases you
> mentioned above. Typically, it is the PCI config registers on some faulty
> devices (that don't respect byte-enable) that are subject to MMIO Stale Data.
>
> But it is impossible to test and confirm with absolute certainty that all

Yeah, no argument there.  

> other cases are not affected. Your patches should rule out those cases as
> well.
> 
> Regarding validating this: if VERW is executed at VM-entry, the mitigation was
> found to be effective. This is similar to other bugs like MDS. I am not a
> virtualization expert, but I will try to validate whatever I can.

If you can re-verify the mitigation works for VFIO devices, that's more than
good enough for me.  The bar at this point is to not regress the existing mitigation;
anything beyond that is gravy.

I've verified the KVM mechanics of tracing MMIO mappings fairly well (famous last
words), the only thing I haven't sanity checked is that the existing coverage for
VFIO devices is maintained.
Re: [PATCH 0/5] KVM: VMX: Fix MMIO Stale Data Mitigation
Posted by Pawan Gupta 4 months, 2 weeks ago
On Mon, Jun 02, 2025 at 04:41:35PM -0700, Sean Christopherson wrote:
> > Regarding validating this: if VERW is executed at VM-entry, the mitigation was
> > found to be effective. This is similar to other bugs like MDS. I am not a
> > virtualization expert, but I will try to validate whatever I can.
> 
> If you can re-verify the mitigation works for VFIO devices, that's more than
> good enough for me.  The bar at this point is to not regress the existing mitigation;
> anything beyond that is gravy.

Ok sure. I'll verify that VERW is getting executed for VFIO devices.

> I've verified the KVM mechanics of tracing MMIO mappings fairly well (famous last
> words), the only thing I haven't sanity checked is that the existing coverage for
> VFIO devices is maintained.
Re: [PATCH 0/5] KVM: VMX: Fix MMIO Stale Data Mitigation
Posted by Pawan Gupta 4 months, 2 weeks ago
On Mon, Jun 02, 2025 at 06:22:08PM -0700, Pawan Gupta wrote:
> On Mon, Jun 02, 2025 at 04:41:35PM -0700, Sean Christopherson wrote:
> > > Regarding validating this: if VERW is executed at VM-entry, the mitigation was
> > > found to be effective. This is similar to other bugs like MDS. I am not a
> > > virtualization expert, but I will try to validate whatever I can.
> > 
> > If you can re-verify the mitigation works for VFIO devices, that's more than
> > good enough for me.  The bar at this point is to not regress the existing mitigation;
> > anything beyond that is gravy.
> 
> Ok sure. I'll verify that VERW is getting executed for VFIO devices.

I have verified that, with the patches below, CPU buffer clearing for MMIO
Stale Data is working as expected for a VFIO device.

  KVM: VMX: Apply MMIO Stale Data mitigation if KVM maps MMIO into the guest
  KVM: x86/mmu: Locally cache whether a PFN is host MMIO when making a SPTE
  KVM: x86: Avoid calling kvm_is_mmio_pfn() when kvm_x86_ops.get_mt_mask is NULL

For the above patches:

Tested-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>

Below are excerpts from the logs with debug prints added:

# virsh start ubuntu24.04                                                      <------ Guest launched
[ 5737.281649] virbr0: port 1(vnet1) entered blocking state
[ 5737.281659] virbr0: port 1(vnet1) entered disabled state
[ 5737.281686] vnet1: entered allmulticast mode
[ 5737.281775] vnet1: entered promiscuous mode
[ 5737.282026] virbr0: port 1(vnet1) entered blocking state
[ 5737.282032] virbr0: port 1(vnet1) entered listening state
[ 5737.775162] vmx_vcpu_enter_exit: 13085 callbacks suppressed
[ 5737.775169] kvm_intel: vmx_vcpu_enter_exit: CPU buffer NOT cleared for MMIO  <----- Buffers not cleared
[ 5737.775192] kvm_intel: vmx_vcpu_enter_exit: CPU buffer NOT cleared for MMIO
[ 5737.775203] kvm_intel: vmx_vcpu_enter_exit: CPU buffer NOT cleared for MMIO
...
Domain 'ubuntu24.04' started

[ 5739.323529] virbr0: port 1(vnet1) entered learning state
[ 5741.372527] virbr0: port 1(vnet1) entered forwarding state
[ 5741.372540] virbr0: topology change detected, propagating
[ 5742.906218] kvm_intel: vmx_vcpu_enter_exit: CPU buffer NOT cleared for MMIO
[ 5742.906232] kvm_intel: vmx_vcpu_enter_exit: CPU buffer NOT cleared for MMIO
[ 5742.906234] kvm_intel: vmx_vcpu_enter_exit: CPU buffer NOT cleared for MMIO
[ 5747.906515] vmx_vcpu_enter_exit: 267825 callbacks suppressed
...

# virsh attach-device ubuntu24.04 vfio.xml  --live                            <----- Device attached

[ 5749.913996] ioatdma 0000:00:01.1: Removing dma and dca services
[ 5750.786112] vfio-pci 0000:00:01.1: resetting
[ 5750.891646] vfio-pci 0000:00:01.1: reset done
[ 5750.900521] vfio-pci 0000:00:01.1: resetting
[ 5751.003645] vfio-pci 0000:00:01.1: reset done
Device attached successfully
[ 5751.074292] kvm_intel: vmx_vcpu_enter_exit: CPU buffer cleared for MMIO    <----- Buffers getting cleared
[ 5751.074293] kvm_intel: vmx_vcpu_enter_exit: CPU buffer cleared for MMIO
[ 5751.074294] kvm_intel: vmx_vcpu_enter_exit: CPU buffer cleared for MMIO
[ 5756.076427] vmx_vcpu_enter_exit: 68991 callbacks suppressed