[PATCH] KVM: x86: preserve interrupt shadow across SMM entries

Posted by Maxim Levitsky 3 years, 10 months ago
If the #SMI happens while the vCPU is in an interrupt shadow
(i.e. right after STI or MOV SS), we must both clear the shadow,
to avoid a VM entry failure on VMX due to the consistency check
against EFLAGS.IF (which is cleared on SMM entry), and restore it
on RSM so that the #SMI is transparent to the non-SMM code.

To support migration, reuse the upper 4 bits of
'kvm_vcpu_events.interrupt.shadow' to store the SMM interrupt shadow.
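
As an illustrative user-space sketch of the proposed encoding (helper
names are invented for illustration; the packing mirrors the patch: low
4 bits carry the live shadow, upper 4 bits the shadow saved on SMM
entry):

```c
#include <stdint.h>

/*
 * Hypothetical helpers modeling the proposed encoding of
 * kvm_vcpu_events.interrupt.shadow: the low 4 bits hold the live
 * interrupt shadow, the upper 4 bits hold the shadow saved on SMM
 * entry.
 */
static inline uint8_t pack_shadow(uint8_t cur, uint8_t smm_saved)
{
	/* Mask the live shadow so stale upper bits cannot leak in. */
	return (uint8_t)((cur & 0xF) | (uint8_t)(smm_saved << 4));
}

static inline uint8_t unpack_cur_shadow(uint8_t v)
{
	return v & 0xF;
}

static inline uint8_t unpack_smm_shadow(uint8_t v)
{
	return v >> 4;
}
```

On the set side, the patch likewise masks the live shadow with 0xF
before handing it to the vendor code.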

This was lightly tested with a Linux guest and an SMM load script;
a unit test will be developed soon to exercise this more thoroughly.

For discussion: there are other ways to fix this issue:

1. The SMM shadow can be stored in SMRAM at some unused offset;
this would avoid changes to kvm_vcpu_ioctl_x86_set_vcpu_events.

2. #SMI could instead be blocked while the interrupt shadow is active,
which might even be what real CPUs do. However, since neither VMX
nor SVM supports SMM window handling, this would involve single-stepping
the guest, as is currently done on SVM for the NMI window in some cases.

What do you think?

Also, this might need a new KVM cap; I am not sure about it.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  3 +++
 arch/x86/kvm/x86.c              | 28 +++++++++++++++++++++++++---
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6cf5d77d78969..4ee14dc79646f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -939,6 +939,9 @@ struct kvm_vcpu_arch {
 	 */
 	bool pdptrs_from_userspace;
 
+	/* Saved interrupt shadow on SMM entry */
+	u8 smm_interrupt_shadow;
+
 #if IS_ENABLED(CONFIG_HYPERV)
 	hpa_t hv_root_tdp;
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2db6f0373fa3f..28d736b74eec6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4915,6 +4915,8 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
 	events->interrupt.soft = 0;
 	events->interrupt.shadow = static_call(kvm_x86_get_interrupt_shadow)(vcpu);
 
+	events->interrupt.shadow |= vcpu->arch.smm_interrupt_shadow << 4;
+
 	events->nmi.injected = vcpu->arch.nmi_injected;
 	events->nmi.pending = vcpu->arch.nmi_pending != 0;
 	events->nmi.masked = static_call(kvm_x86_get_nmi_mask)(vcpu);
@@ -4988,9 +4990,6 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 	vcpu->arch.interrupt.injected = events->interrupt.injected;
 	vcpu->arch.interrupt.nr = events->interrupt.nr;
 	vcpu->arch.interrupt.soft = events->interrupt.soft;
-	if (events->flags & KVM_VCPUEVENT_VALID_SHADOW)
-		static_call(kvm_x86_set_interrupt_shadow)(vcpu,
-						events->interrupt.shadow);
 
 	vcpu->arch.nmi_injected = events->nmi.injected;
 	if (events->flags & KVM_VCPUEVENT_VALID_NMI_PENDING)
@@ -5024,6 +5023,14 @@ static int kvm_vcpu_ioctl_x86_set_vcpu_events(struct kvm_vcpu *vcpu,
 		}
 	}
 
+	if (events->flags & KVM_VCPUEVENT_VALID_SHADOW) {
+		static_call(kvm_x86_set_interrupt_shadow)(vcpu, events->interrupt.shadow & 0xF);
+		if (events->flags & KVM_VCPUEVENT_VALID_SMM)
+			vcpu->arch.smm_interrupt_shadow = events->interrupt.shadow >> 4;
+		else
+			vcpu->arch.smm_interrupt_shadow = 0;
+	}
+
 	if (events->flags & KVM_VCPUEVENT_VALID_TRIPLE_FAULT) {
 		if (!vcpu->kvm->arch.triple_fault_event)
 			return -EINVAL;
@@ -8282,6 +8289,15 @@ static void kvm_smm_changed(struct kvm_vcpu *vcpu, bool entering_smm)
 
 	if (entering_smm) {
 		vcpu->arch.hflags |= HF_SMM_MASK;
+
+		/*
+		 * Stash the current interrupt shadow and clear it on SMM entry.
+		 */
+		vcpu->arch.smm_interrupt_shadow =
+				static_call(kvm_x86_get_interrupt_shadow)(vcpu);
+
+		static_call(kvm_x86_set_interrupt_shadow)(vcpu, 0);
+
 	} else {
 		vcpu->arch.hflags &= ~(HF_SMM_MASK | HF_SMM_INSIDE_NMI_MASK);
 
@@ -8294,6 +8310,12 @@ static void kvm_smm_changed(struct kvm_vcpu *vcpu, bool entering_smm)
 		 * guest memory
 		 */
 		vcpu->arch.pdptrs_from_userspace = false;
+
+		/* Restore interrupt shadow to its pre-SMM value */
+		static_call(kvm_x86_set_interrupt_shadow)(vcpu,
+					vcpu->arch.smm_interrupt_shadow);
+
+		vcpu->arch.smm_interrupt_shadow = 0;
 	}
 
 	kvm_mmu_reset_context(vcpu);
-- 
2.31.1
Re: [PATCH] KVM: x86: preserve interrupt shadow across SMM entries
Posted by Paolo Bonzini 3 years, 10 months ago
On 6/7/22 17:16, Maxim Levitsky wrote:
> If the #SMI happens while the vCPU is in the interrupt shadow,
> (after STI or MOV SS),
> we must both clear it to avoid VM entry failure on VMX,
> due to consistency check vs EFLAGS.IF which is cleared on SMM entries,
> and restore it on RSM so that #SMI is transparent to the non SMM code.
> 
> To support migration, reuse upper 4 bits of
> 'kvm_vcpu_events.interrupt.shadow' to store the smm interrupt shadow.
> 
> This was lightly tested with a linux guest and smm load script,
> and a unit test will be soon developed to test this better.
> 
> For discussion: there are other ways to fix this issue:
> 
> 1. The SMM shadow can be stored in SMRAM at some unused
> offset, this will allow to avoid changes to kvm_vcpu_ioctl_x86_set_vcpu_events

Yes, that would be better (and would not require a new cap).

Paolo
Re: [PATCH] KVM: x86: preserve interrupt shadow across SMM entries
Posted by Sean Christopherson 3 years, 10 months ago
On Tue, Jun 07, 2022, Paolo Bonzini wrote:
> On 6/7/22 17:16, Maxim Levitsky wrote:
> > If the #SMI happens while the vCPU is in the interrupt shadow,
> > (after STI or MOV SS),
> > we must both clear it to avoid VM entry failure on VMX,
> > due to consistency check vs EFLAGS.IF which is cleared on SMM entries,
> > and restore it on RSM so that #SMI is transparent to the non SMM code.
> > 
> > To support migration, reuse upper 4 bits of
> > 'kvm_vcpu_events.interrupt.shadow' to store the smm interrupt shadow.
> > 
> > This was lightly tested with a linux guest and smm load script,
> > and a unit test will be soon developed to test this better.
> > 
> > For discussion: there are other ways to fix this issue:
> > 
> > 1. The SMM shadow can be stored in SMRAM at some unused
> > offset, this will allow to avoid changes to kvm_vcpu_ioctl_x86_set_vcpu_events
> 
> Yes, that would be better (and would not require a new cap).

At what point do we chalk up SMM emulation as a failed experiment and deprecate
support?  There are most definitely more bugs lurking in KVM's handling of
save/restore across SMI+RSM.

> > 2. #SMI can instead be blocked while the interrupt shadow is active,
> > which might even be what the real CPU does, however since neither VMX
> > nor SVM support SMM window handling, this will involve single stepping
> > the guest like it is currently done on SVM for the NMI window in some cases.

FWIW, blocking SMI in STI/MOVSS shadows is explicitly allowed by the Intel SDM.
IIRC, modern Intel CPUs block SMIs in MOVSS shadows but not STI shadows.
Re: [PATCH] KVM: x86: preserve interrupt shadow across SMM entries
Posted by Maxim Levitsky 3 years, 10 months ago
On Tue, 2022-06-07 at 19:22 +0000, Sean Christopherson wrote:
> On Tue, Jun 07, 2022, Paolo Bonzini wrote:
> > On 6/7/22 17:16, Maxim Levitsky wrote:
> > > If the #SMI happens while the vCPU is in the interrupt shadow,
> > > (after STI or MOV SS),
> > > we must both clear it to avoid VM entry failure on VMX,
> > > due to consistency check vs EFLAGS.IF which is cleared on SMM entries,
> > > and restore it on RSM so that #SMI is transparent to the non SMM code.
> > > 
> > > To support migration, reuse upper 4 bits of
> > > 'kvm_vcpu_events.interrupt.shadow' to store the smm interrupt shadow.
> > > 
> > > This was lightly tested with a linux guest and smm load script,
> > > and a unit test will be soon developed to test this better.
> > > 
> > > For discussion: there are other ways to fix this issue:
> > > 
> > > 1. The SMM shadow can be stored in SMRAM at some unused
> > > offset, this will allow to avoid changes to kvm_vcpu_ioctl_x86_set_vcpu_events
> > 
> > Yes, that would be better (and would not require a new cap).
> 
> At what point do we chalk up SMM emulation as a failed experiment and deprecate
> support?  There are most definitely more bugs lurking in KVM's handling of
> save/restore across SMI+RSM.

I also kind of agree that SMM was a mistake, but these days VMs with
secure boot use it, so we can't stop supporting it.

So do you also agree that I should write the interrupt shadow to SMRAM?

Best regards,
	Maxim Levitsky

> 
> > > 2. #SMI can instead be blocked while the interrupt shadow is active,
> > > which might even be what the real CPU does, however since neither VMX
> > > nor SVM support SMM window handling, this will involve single stepping
> > > the guest like it is currently done on SVM for the NMI window in some cases.
> 
> FWIW, blocking SMI in STI/MOVSS shadows is explicitly allowed by the Intel SDM.
> IIRC, modern Intel CPUs block SMIs in MOVSS shadows but not STI shadows.
>
Re: [PATCH] KVM: x86: preserve interrupt shadow across SMM entries
Posted by Sean Christopherson 3 years, 10 months ago
On Wed, Jun 08, 2022, Maxim Levitsky wrote:
> On Tue, 2022-06-07 at 19:22 +0000, Sean Christopherson wrote:
> > On Tue, Jun 07, 2022, Paolo Bonzini wrote:
> > > On 6/7/22 17:16, Maxim Levitsky wrote:
> > > > If the #SMI happens while the vCPU is in the interrupt shadow,
> > > > (after STI or MOV SS),
> > > > we must both clear it to avoid VM entry failure on VMX,
> > > > due to consistency check vs EFLAGS.IF which is cleared on SMM entries,
> > > > and restore it on RSM so that #SMI is transparent to the non SMM code.
> > > > 
> > > > To support migration, reuse upper 4 bits of
> > > > 'kvm_vcpu_events.interrupt.shadow' to store the smm interrupt shadow.
> > > > 
> > > > This was lightly tested with a linux guest and smm load script,
> > > > and a unit test will be soon developed to test this better.
> > > > 
> > > > For discussion: there are other ways to fix this issue:
> > > > 
> > > > 1. The SMM shadow can be stored in SMRAM at some unused
> > > > offset, this will allow to avoid changes to kvm_vcpu_ioctl_x86_set_vcpu_events
> > > 
> > > Yes, that would be better (and would not require a new cap).
> > 
> > At what point do we chalk up SMM emulation as a failed experiment and deprecate
> > support?  There are most definitely more bugs lurking in KVM's handling of
> > save/restore across SMI+RSM.
> 
> I also kind of agree that SMM was a mistake, but these days VMs with
> secure boot use it, so we can't stop supporting it.

Ugh, found the KVM forum presentation. That's unfortunate :-(

> So do you also agree that I should write the interrupt shadow to SMRAM?

Yep, unless we want to block SMIs in shadows, which I don't think is allowed by
AMD's architecture.  Using a micro-architecture-specific field in SMRAM is how
actual silicon would preserve the state.
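
To make the agreed-upon direction concrete, here is a minimal
user-space sketch of option 1: stash the shadow in an SMRAM state-save
field on SMM entry and restore it on RSM. The offset and helper names
are purely hypothetical, not the real SMRAM save-area layout:

```c
#include <stdint.h>

/*
 * Toy model of storing the interrupt shadow in SMRAM.
 * SMRAM_INT_SHADOW_OFF is an invented unused offset; a real
 * implementation would pick a field from the vendor's documented
 * (or micro-architectural) state-save map.
 */
#define SMRAM_SIZE           0x10000
#define SMRAM_INT_SHADOW_OFF 0x7ECA

/* On SMM entry: save the current shadow, return the cleared shadow. */
static inline uint8_t enter_smm_save_shadow(uint8_t *smram, uint8_t shadow)
{
	smram[SMRAM_INT_SHADOW_OFF] = shadow;
	return 0;
}

/* On RSM: read back the pre-SMM shadow from SMRAM. */
static inline uint8_t rsm_restore_shadow(const uint8_t *smram)
{
	return smram[SMRAM_INT_SHADOW_OFF];
}
```

Because the state round-trips through SMRAM like any other saved
register, migration works for free: SMRAM is guest memory, so no
new userspace API (and no new cap) is needed.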