From nobody Thu Sep 18 02:37:58 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44DECC4332F for ; Tue, 13 Dec 2022 06:10:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234536AbiLMGKJ (ORCPT ); Tue, 13 Dec 2022 01:10:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234461AbiLMGJu (ORCPT ); Tue, 13 Dec 2022 01:09:50 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8BE721ADA8 for ; Mon, 12 Dec 2022 22:09:28 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id k7-20020a17090a39c700b002192c16f19aso1347144pjf.1 for ; Mon, 12 Dec 2022 22:09:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=FqdJjiwJ+3KBa9yJbuoyosmHEFIdRQ7rtSq2Gl3y3sA=; b=L0uRFcqMg4zxrXLXzS5agUTJtSHNHZaZB/Umstjib0KDLbzEMDkaIE5IYILiNh4ihO Ks61vZYMKGfYWtht5ScwwNerE+C+y6uWagJ0CcBgoMLA2MHnwrngn38QzloYlobOUjZ2 LyWEcAhE8EP0Ftp6XZJM/ey9/GhxkepiJvO7Zdo3G+MDxVA6RcDRZc9XqkSiJ0pa5KsN GjAVmtse3/tNYkngK2mewx7lLyeujrbP9aRghooZ2FUrMr++mTifKKsdKXJjGPNfl4VC EVTOIpaCOflVepNg538v55OTMIVlEOvHsRVJ0ZKCh92IxSJ3cgUee0q54Mfe68Bz0vwf ej5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=FqdJjiwJ+3KBa9yJbuoyosmHEFIdRQ7rtSq2Gl3y3sA=; b=yA+JhqbwSL3/J4aIEaFlAPY+4aLnwsuaVBZj0qJrFcvEORq+HbiPRnCYrp16OY3WRk GNDRQeIkx4/NPqdEg5hETYZ7RjNUsyj2a1NhNR7nVB8DNJrp2VExduWPC5gGiELCNj3d j4+r3/1ju3cv+BxLn7o0eqc937Gm7U2mzCXpiu3xqXjzcinssccAlbgjcsc7xscpF3Rx ASQABG2fZBhPqTfPw1X6rwCpXKDE356H6il6JEorV5yUSYs8PoU0yV2s6jpFs9TZfUYd UcGvjb4pVzP/wFXv7EVICjK7KeGLiiyKthqr2o10VCTVlN4ZPwqyE9Nn8ODIQzRt6jdO KvUw== X-Gm-Message-State: ANoB5pkHxviwmmyW9y+wI53RfLVfwdn5IqIYEWRubppvQTT9+h3cCMrP e3zxQ11Mdv468qqp1sXPJ/6xqr1TsoQ= X-Google-Smtp-Source: AA0mqf68WNGPX3LjDP06mA0PiEMAKBsyVZ5/dIfSh6Ot3lsLIv8000phHCdV7ZPecbA8WXpdCc4Ktcn+8wY= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:aa7:85d9:0:b0:577:53a9:836c with SMTP id z25-20020aa785d9000000b0057753a9836cmr10672135pfn.5.1670911768146; Mon, 12 Dec 2022 22:09:28 -0800 (PST) Reply-To: Sean Christopherson Date: Tue, 13 Dec 2022 06:09:12 +0000 In-Reply-To: <20221213060912.654668-1-seanjc@google.com> Mime-Version: 1.0 References: <20221213060912.654668-1-seanjc@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213060912.654668-8-seanjc@google.com> Subject: [PATCH 7/7] KVM: VMX: Handle NMI VM-Exits in noinstr region From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Zijlstra , Andy Lutomirski , Thomas Gleixner Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move VMX's handling of NMI VM-Exits into vmx_vcpu_enter_exit() so that the NMI is handled prior to leaving the safety of noinstr. Handling the NMI after leaving noinstr exposes the kernel to potential ordering problems as an instrumentation-induced fault, e.g. #DB, #BP, #PF, etc. will unblock NMIs when IRETing back to the faulting instruction. Reported-by: Peter Zijlstra Signed-off-by: Sean Christopherson Reported-by: Like Xu --- arch/x86/kvm/vmx/vmcs.h | 4 ++-- arch/x86/kvm/vmx/vmenter.S | 8 ++++---- arch/x86/kvm/vmx/vmx.c | 34 +++++++++++++++++++++------------- arch/x86/kvm/x86.h | 6 +++--- 4 files changed, 30 insertions(+), 22 deletions(-) diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h index ac290a44a693..7c1996b433e2 100644 --- a/arch/x86/kvm/vmx/vmcs.h +++ b/arch/x86/kvm/vmx/vmcs.h @@ -75,7 +75,7 @@ struct loaded_vmcs { struct vmcs_controls_shadow controls_shadow; }; =20 -static inline bool is_intr_type(u32 intr_info, u32 type) +static __always_inline bool is_intr_type(u32 intr_info, u32 type) { const u32 mask =3D INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK; =20 @@ -146,7 +146,7 @@ static inline bool is_icebp(u32 intr_info) return is_intr_type(intr_info, INTR_TYPE_PRIV_SW_EXCEPTION); } =20 -static inline bool is_nmi(u32 intr_info) +static __always_inline bool is_nmi(u32 intr_info) { return is_intr_type(intr_info, INTR_TYPE_NMI_INTR); } diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S index 9d987e7e48c4..059243085211 100644 --- a/arch/x86/kvm/vmx/vmenter.S +++ b/arch/x86/kvm/vmx/vmenter.S @@ -299,6 +299,10 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL) =20 SYM_FUNC_END(__vmx_vcpu_run) =20 +SYM_FUNC_START(vmx_do_nmi_irqoff) + VMX_DO_EVENT_IRQOFF call asm_exc_nmi_kvm_vmx +SYM_FUNC_END(vmx_do_nmi_irqoff) + =20 .section .text, "ax" =20 @@ -353,10 +357,6 @@ SYM_FUNC_START(vmread_error_trampoline) SYM_FUNC_END(vmread_error_trampoline) #endif =20 -SYM_FUNC_START(vmx_do_nmi_irqoff) - VMX_DO_EVENT_IRQOFF call asm_exc_nmi_kvm_vmx -SYM_FUNC_END(vmx_do_nmi_irqoff) - SYM_FUNC_START(vmx_do_interrupt_irqoff) VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1 SYM_FUNC_END(vmx_do_interrupt_irqoff) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index c242e2591896..b03020ca1840 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5095,8 +5095,13 @@ static int handle_exception_nmi(struct kvm_vcpu *vcp= u) vect_info =3D vmx->idt_vectoring_info; intr_info =3D vmx_get_intr_info(vcpu); =20 + /* + * Machine checks are handled by handle_exception_irqoff(), or by + * vmx_vcpu_run() if a #MC occurs on VM-Entry. NMIs are handled by + * vmx_vcpu_enter_exit(). + */ if (is_machine_check(intr_info) || is_nmi(intr_info)) - return 1; /* handled by handle_exception_nmi_irqoff() */ + return 1; =20 /* * Queue the exception here instead of in handle_nm_fault_irqoff(). @@ -6809,7 +6814,7 @@ static void handle_nm_fault_irqoff(struct kvm_vcpu *v= cpu) rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); } =20 -static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx) +static void handle_exception_irqoff(struct vcpu_vmx *vmx) { u32 intr_info =3D vmx_get_intr_info(&vmx->vcpu); =20 @@ -6822,12 +6827,6 @@ static void handle_exception_nmi_irqoff(struct vcpu_= vmx *vmx) /* Handle machine checks before interrupts are enabled */ else if (is_machine_check(intr_info)) kvm_machine_check(); - /* We need to handle NMIs before interrupts are enabled */ - else if (is_nmi(intr_info)) { - kvm_before_interrupt(&vmx->vcpu, KVM_HANDLING_NMI); - vmx_do_nmi_irqoff(); - kvm_after_interrupt(&vmx->vcpu); - } } =20 static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) @@ -6857,7 +6856,7 @@ static void vmx_handle_exit_irqoff(struct kvm_vcpu *v= cpu) if (vmx->exit_reason.basic =3D=3D EXIT_REASON_EXTERNAL_INTERRUPT) handle_external_interrupt_irqoff(vcpu); else if (vmx->exit_reason.basic =3D=3D EXIT_REASON_EXCEPTION_NMI) - handle_exception_nmi_irqoff(vmx); + handle_exception_irqoff(vmx); } =20 /* @@ -7119,6 +7118,18 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_v= cpu *vcpu, =20 vmx_enable_fb_clear(vmx); =20 + if (unlikely(vmx->fail)) + vmx->exit_reason.full =3D 0xdead; + else + vmx->exit_reason.full =3D vmcs_read32(VM_EXIT_REASON); + + if ((u16)vmx->exit_reason.basic =3D=3D EXIT_REASON_EXCEPTION_NMI && + is_nmi(vmx_get_intr_info(vcpu))) { + kvm_before_interrupt(vcpu, KVM_HANDLING_NMI); + vmx_do_nmi_irqoff(); + kvm_after_interrupt(vcpu); + } + guest_state_exit_irqoff(); } =20 @@ -7260,12 +7271,9 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) =20 vmx->idt_vectoring_info =3D 0; =20 - if (unlikely(vmx->fail)) { - vmx->exit_reason.full =3D 0xdead; + if (unlikely(vmx->fail)) return EXIT_FASTPATH_NONE; - } =20 - vmx->exit_reason.full =3D vmcs_read32(VM_EXIT_REASON); if (unlikely((u16)vmx->exit_reason.basic =3D=3D EXIT_REASON_MCE_DURING_VM= ENTRY)) kvm_machine_check(); =20 diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 9de72586f406..44d1827f0a30 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -382,13 +382,13 @@ enum kvm_intr_type { KVM_HANDLING_NMI, }; =20 -static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu, - enum kvm_intr_type intr) +static __always_inline void kvm_before_interrupt(struct kvm_vcpu *vcpu, + enum kvm_intr_type intr) { WRITE_ONCE(vcpu->arch.handling_intr_from_guest, (u8)intr); } =20 -static inline void kvm_after_interrupt(struct kvm_vcpu *vcpu) +static __always_inline void kvm_after_interrupt(struct kvm_vcpu *vcpu) { WRITE_ONCE(vcpu->arch.handling_intr_from_guest, 0); } --=20 2.39.0.rc1.256.g54fd8350bd-goog