KVM: x86: forcibly leave SMM mode on vCPU reset

[PATCH] KVM: x86: forcibly leave SMM mode on vCPU reset

Posted by Mikhail Lobanov 10 months, 2 weeks ago

Previously, commit ed129ec9057f ("KVM: x86: forcibly leave nested mode
on vCPU reset") addressed an issue where a triple fault occurring in
nested mode could lead to use-after-free scenarios. However, the commit
did not handle the analogous situation for System Management Mode (SMM).

This omission results in triggering a WARN when a vCPU reset occurs
while still in SMM mode, due to the check in kvm_vcpu_reset(). This
situation was reproduced using Syzkaller by:
1) Creating a KVM VM and vCPU
2) Sending a KVM_SMI ioctl to explicitly enter SMM
3) Executing invalid instructions causing consecutive exceptions and
eventually a triple fault

The issue manifests as follows:

WARNING: CPU: 0 PID: 25506 at arch/x86/kvm/x86.c:12112
kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112
Modules linked in:
CPU: 0 PID: 25506 Comm: syz-executor.0 Not tainted
6.1.130-syzkaller-00157-g164fe5dde9b6 #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.12.0-1 04/01/2014
RIP: 0010:kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112
Call Trace:
 <TASK>
 shutdown_interception+0x66/0xb0 arch/x86/kvm/svm/svm.c:2136
 svm_invoke_exit_handler+0x110/0x530 arch/x86/kvm/svm/svm.c:3395
 svm_handle_exit+0x424/0x920 arch/x86/kvm/svm/svm.c:3457
 vcpu_enter_guest arch/x86/kvm/x86.c:10959 [inline]
 vcpu_run+0x2c43/0x5a90 arch/x86/kvm/x86.c:11062
 kvm_arch_vcpu_ioctl_run+0x50f/0x1cf0 arch/x86/kvm/x86.c:11283
 kvm_vcpu_ioctl+0x570/0xf00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4122
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Considering that hardware CPUs exit SMM mode completely upon receiving
a triple fault by triggering a hardware reset (which inherently leads
to exiting SMM), explicitly perform SMM exit prior to the WARN check.
Although subsequent code clears vCPU hflags, including the SMM flag,
calling kvm_smm_changed ensures the exit from SMM is handled correctly
and explicitly, aligning precisely with hardware behavior.


Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: ed129ec9057f ("KVM: x86: forcibly leave nested mode on vCPU reset")
Cc: stable@vger.kernel.org
Signed-off-by: Mikhail Lobanov <m.lobanov@rosa.ru>
---
 arch/x86/kvm/x86.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4b64ab350bcd..f1c95c21703a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12409,6 +12409,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	if (is_guest_mode(vcpu))
 		kvm_leave_nested(vcpu);
 
+	if (is_smm(vcpu))
+		kvm_smm_changed(vcpu, false);
+
 	kvm_lapic_reset(vcpu, init_event);
 
 	WARN_ON_ONCE(is_guest_mode(vcpu) || is_smm(vcpu));
-- 
2.47.2
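
The ordering argument in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `toy_vcpu`, `toy_is_smm()`, `toy_smm_changed()`, `toy_vcpu_reset()` and the bit value of `TOY_HF_SMM` are all hypothetical stand-ins, and the real kvm_smm_changed() does more than flip a flag. The model only shows that the sanity check runs before the flags are cleared, so SMM has to be left earlier if the check is not to fire.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag bit; the value is an assumption, not taken from the kernel. */
#define TOY_HF_SMM (1u << 1)

/* Hypothetical stand-in for the small part of vCPU state that matters here. */
struct toy_vcpu {
	unsigned int hflags;
	int warn_count;	/* how many times the sanity check fired */
};

static bool toy_is_smm(const struct toy_vcpu *v)
{
	return (v->hflags & TOY_HF_SMM) != 0;
}

/* Toy version of an SMM state change: it only sets or clears the flag. */
static void toy_smm_changed(struct toy_vcpu *v, bool entering_smm)
{
	if (entering_smm)
		v->hflags |= TOY_HF_SMM;
	else
		v->hflags &= ~TOY_HF_SMM;
}

/*
 * Toy reset flow: optionally leave SMM first (the behaviour the patch adds),
 * then run the sanity check, then clear all flags as later reset code does.
 */
static void toy_vcpu_reset(struct toy_vcpu *v, bool leave_smm_first)
{
	if (leave_smm_first && toy_is_smm(v))
		toy_smm_changed(v, false);

	if (toy_is_smm(v)) {	/* stand-in for the WARN_ON_ONCE() check */
		v->warn_count++;
		fprintf(stderr, "toy WARN: reset while in SMM\n");
	}

	v->hflags = 0;
}
```

In this model, resetting a vCPU that is in SMM with `leave_smm_first` set to false increments `warn_count`, while the same reset with it set to true does not. In both cases `hflags` ends up cleared.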

Re: [PATCH] KVM: x86: forcibly leave SMM mode on vCPU reset

Posted by Sean Christopherson 10 months ago

On Mon, Mar 24, 2025, Mikhail Lobanov wrote:
> Previously, commit ed129ec9057f ("KVM: x86: forcibly leave nested mode
> on vCPU reset") addressed an issue where a triple fault occurring in
> nested mode could lead to use-after-free scenarios. However, the commit
> did not handle the analogous situation for System Management Mode (SMM).
> 
> This omission results in triggering a WARN when a vCPU reset occurs
> while still in SMM mode, due to the check in kvm_vcpu_reset(). This
> situation was reproduced using Syzkaller by:
> 1) Creating a KVM VM and vCPU
> 2) Sending a KVM_SMI ioctl to explicitly enter SMM
> 3) Executing invalid instructions causing consecutive exceptions and
> eventually a triple fault
> 
> The issue manifests as follows:
> 
> WARNING: CPU: 0 PID: 25506 at arch/x86/kvm/x86.c:12112
> kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112
> Modules linked in:
> CPU: 0 PID: 25506 Comm: syz-executor.0 Not tainted
> 6.1.130-syzkaller-00157-g164fe5dde9b6 #0
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.12.0-1 04/01/2014
> RIP: 0010:kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112
> Call Trace:
>  <TASK>
>  shutdown_interception+0x66/0xb0 arch/x86/kvm/svm/svm.c:2136
>  svm_invoke_exit_handler+0x110/0x530 arch/x86/kvm/svm/svm.c:3395
>  svm_handle_exit+0x424/0x920 arch/x86/kvm/svm/svm.c:3457
>  vcpu_enter_guest arch/x86/kvm/x86.c:10959 [inline]
>  vcpu_run+0x2c43/0x5a90 arch/x86/kvm/x86.c:11062
>  kvm_arch_vcpu_ioctl_run+0x50f/0x1cf0 arch/x86/kvm/x86.c:11283
>  kvm_vcpu_ioctl+0x570/0xf00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4122
>  vfs_ioctl fs/ioctl.c:51 [inline]
>  __do_sys_ioctl fs/ioctl.c:870 [inline]
>  __se_sys_ioctl fs/ioctl.c:856 [inline]
>  __x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:856
>  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>  do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
>  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> 
> Considering that hardware CPUs exit SMM mode completely upon receiving
> a triple fault by triggering a hardware reset (which inherently leads
> to exiting SMM), explicitly perform SMM exit prior to the WARN check.
> Although subsequent code clears vCPU hflags, including the SMM flag,
> calling kvm_smm_changed ensures the exit from SMM is handled correctly
> and explicitly, aligning precisely with hardware behavior.
> 
> 
> Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
> 
> Fixes: ed129ec9057f ("KVM: x86: forcibly leave nested mode on vCPU reset")
> Cc: stable@vger.kernel.org
> Signed-off-by: Mikhail Lobanov <m.lobanov@rosa.ru>
> ---
>  arch/x86/kvm/x86.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4b64ab350bcd..f1c95c21703a 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -12409,6 +12409,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>  	if (is_guest_mode(vcpu))
>  		kvm_leave_nested(vcpu);
>  
> +	if (is_smm(vcpu))
> +		kvm_smm_changed(vcpu, false);

Hmm, this probably belongs in SVM's shutdown_interception().  Architecturally,
INIT is blocked when the CPU is in SMM.  KVM's WARN() below is intended to guard
against KVM bugs more than anything else, e.g. if KVM emulates INIT when it shouldn't.

SHUTDOWN on SVM is a weird edge case where KVM needs to do _something_ sane with
the VMCB, since it's technically undefined, and INIT is the least awful choice given
KVM's ABI.

I can't think of any other paths that can/should force INIT while SMM is active,
so while it's a bit gross to have this as a one-off, I think we should do:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cc1c721ba067..5a2041bc1ba2 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2231,6 +2231,8 @@ static int shutdown_interception(struct kvm_vcpu *vcpu)
         */
        if (!sev_es_guest(vcpu->kvm)) {
                clear_page(svm->vmcb);
+               if (is_smm(vcpu))
+                       kvm_smm_changed(vcpu, false);
                kvm_vcpu_reset(vcpu, true);
        }
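
To make the trade-off concrete, here is a rough user-space model of handling the SMM exit in the SHUTDOWN path only. It is a sketch under assumptions rather than KVM code: every `model_*` name and the `MODEL_HF_SMM` bit value are hypothetical, and `model_buggy_init()` is an invented example of a path that emulates INIT when it shouldn't. The point it illustrates is that the check inside reset stays able to catch such a path.

```c
#include <stdbool.h>

/* Illustrative flag bit; the value is an assumption, not taken from the kernel. */
#define MODEL_HF_SMM (1u << 1)

/* Hypothetical stand-in for the relevant vCPU state. */
struct model_vcpu {
	unsigned int hflags;
	int warn_count;	/* how many times the sanity check fired */
};

static bool model_is_smm(const struct model_vcpu *v)
{
	return (v->hflags & MODEL_HF_SMM) != 0;
}

static void model_leave_smm(struct model_vcpu *v)
{
	v->hflags &= ~MODEL_HF_SMM;
}

/* Reset keeps its sanity check and does not leave SMM on its own. */
static void model_vcpu_reset(struct model_vcpu *v)
{
	if (model_is_smm(v))	/* stand-in for the WARN_ON_ONCE() check */
		v->warn_count++;
	v->hflags = 0;
}

/* SHUTDOWN path: the one-off caller that leaves SMM before forcing INIT. */
static void model_shutdown_interception(struct model_vcpu *v)
{
	if (model_is_smm(v))
		model_leave_smm(v);
	model_vcpu_reset(v);
}

/* Invented buggy path: emulates INIT without considering SMM at all. */
static void model_buggy_init(struct model_vcpu *v)
{
	model_vcpu_reset(v);
}
```

With a vCPU in SMM, `model_shutdown_interception()` leaves `warn_count` at zero, whereas `model_buggy_init()` increments it. If the SMM exit lived inside `model_vcpu_reset()` instead, the buggy caller would go unnoticed in this model.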
 


> +
>  	kvm_lapic_reset(vcpu, init_event);
>  
>  	WARN_ON_ONCE(is_guest_mode(vcpu) || is_smm(vcpu));
> -- 
> 2.47.2
>

Re: [PATCH] KVM: x86: forcibly leave SMM mode on vCPU reset

Posted by Mikhail Lobanov 10 months, 1 week ago

Hi,

Gentle ping on this patch:  
https://lore.kernel.org/kvm/20250324175707.19925-1-m.lobanov@rosa.ru/

Sent on March 24, still waiting for feedback.  
Happy to update the patch if needed — let me know.

Thanks!

Best regards,  
Mikhail Lobanov

Re: [PATCH] KVM: x86: forcibly leave SMM mode on vCPU reset

Posted by Sean Christopherson 10 months, 1 week ago

On Tue, Apr 01, 2025, Mikhail Lobanov wrote:
> Hi,
> 
> Gentle ping on this patch:  
> https://lore.kernel.org/kvm/20250324175707.19925-1-m.lobanov@rosa.ru/
> 
> Sent on March 24, still waiting for feedback.  

It's in the queue.  From Documentation/process/maintainer-kvm-x86.rst:

Timeline
~~~~~~~~
Submissions are typically reviewed and applied in FIFO order, with some wiggle
room for the size of a series, patches that are "cache hot", etc.  Fixes,
especially for the current release and or stable trees, get to jump the queue.
Patches that will be taken through a non-KVM tree (most often through the tip
tree) and/or have other acks/reviews also jump the queue to some extent.

Note, the vast majority of review is done between rc1 and rc6, give or take.
The period between rc6 and the next rc1 is used to catch up on other tasks,
i.e. radio silence during this period isn't unusual.