From nobody Mon Jun 8 08:52:46 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Subject: [PATCH 1/3] KVM: nVMX: unwind PDPTR load if processor triggers a nested VMFail
Date: Thu, 4 Jun 2026 12:07:31 -0400
Message-ID: <20260604160733.12555-2-pbonzini@redhat.com>
In-Reply-To: <20260604160733.12555-1-pbonzini@redhat.com>
References: <20260604160733.12555-1-pbonzini@redhat.com>

Upon a VM-entry failure that is caught by the processor rather than by
KVM, nested_vmx_restore_host_state() restores L1's CR3 but not the
PDPTRs.  If shadow paging is used (enable_ept is false), the L2 PDPTRs
loaded during the aborted entry attempt remain in
vcpu->arch.mmu->pdptrs[].  Keeping the PDPTRs in the MMU does not help
here, because KVM uses only root_mmu when enable_ept is false, so the
L1 and L2 values share the same array.

To fix this, use nested_vmx_load_cr3() instead of open coding just the
load of vcpu->arch.cr3, in the same way as load_vmcs12_host_state().
nested_vmx_load_cr3() will mark the register as dirty rather than
available, but this is only a very minor pessimization.

If EPT *is* in use, do not load the PDPTRs and rely solely on
ept_save_pdptrs() to reload them from VMCS01.  When vmx_load_mmu_pgd()
runs on the next entry, the PDPTRs are available, meaning they are not
incorrectly reloaded from memory.

kvm_mmu_unload() is kept to preserve the behavior of the old
kvm_mmu_reset_context() call, but it is actually unnecessary and can be
removed in a separate patch.

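The stale-PDPTR problem described above can be sketched with a small
userspace C model.  This is not KVM code: struct toy_vcpu and the toy_*
functions are hypothetical names invented for the illustration, and the
reload_pdptrs flag merely stands in for the !enable_ept argument that
the patch passes to nested_vmx_load_cr3().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only; none of these names exist in KVM. */
#define TOY_NR_PDPTRS 4

struct toy_vcpu {
	uint64_t cr3;                   /* software copy of the guest CR3 */
	uint64_t pdptrs[TOY_NR_PDPTRS]; /* cached PDPTEs used by shadow paging */
};

/* Models the aborted entry: CR3 and the cached PDPTRs both switch to L2. */
void toy_enter_l2(struct toy_vcpu *v, uint64_t l2_cr3,
		  const uint64_t *l2_pdptrs)
{
	assert(v != NULL && l2_pdptrs != NULL);
	v->cr3 = l2_cr3;
	memcpy(v->pdptrs, l2_pdptrs, sizeof(v->pdptrs));
}

/* Models the old unwind: only CR3 is restored, the PDPTR cache is untouched. */
void toy_restore_cr3_only(struct toy_vcpu *v, uint64_t l1_cr3)
{
	assert(v != NULL);
	v->cr3 = l1_cr3;
}

/*
 * Models the new unwind: CR3 is restored and, when reload_pdptrs is set
 * (the shadow paging case in this sketch), the cache is refilled with
 * L1's values.
 */
void toy_restore_cr3_and_pdptrs(struct toy_vcpu *v, uint64_t l1_cr3,
				const uint64_t *l1_pdptrs, bool reload_pdptrs)
{
	assert(v != NULL && l1_pdptrs != NULL);
	v->cr3 = l1_cr3;
	if (reload_pdptrs)
		memcpy(v->pdptrs, l1_pdptrs, sizeof(v->pdptrs));
}

/* Returns true if the cached PDPTRs equal the expected values. */
bool toy_pdptrs_match(const struct toy_vcpu *v, const uint64_t *expected)
{
	assert(v != NULL && expected != NULL);
	return memcmp(v->pdptrs, expected, sizeof(v->pdptrs)) == 0;
}
```

In this model, calling toy_enter_l2() followed by toy_restore_cr3_only()
leaves cr3 pointing at L1 while pdptrs[] still holds L2's values, which
is the inconsistency the patch removes.  With
toy_restore_cr3_and_pdptrs() and reload_pdptrs set, both are back in
sync.
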
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/nested.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4690a4d23709..d612a5d071fc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4947,6 +4947,7 @@ static inline u64 nested_vmx_get_vmcs01_guest_efer(struct vcpu_vmx *vmx)
 
 static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
 {
+	enum vm_entry_failure_code ignored;
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct vmx_msr_entry g, h;
@@ -4984,20 +4985,19 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
 	vmx_set_cr4(vcpu, vmcs_readl(CR4_READ_SHADOW));
 
 	nested_ept_uninit_mmu_context(vcpu);
-	vcpu->arch.cr3 = vmcs_readl(GUEST_CR3);
-	kvm_register_mark_available(vcpu, VCPU_REG_CR3);
 
 	/*
-	 * Use ept_save_pdptrs(vcpu) to load the MMU's cached PDPTRs
-	 * from vmcs01 (if necessary).  The PDPTRs are not loaded on
-	 * VMFail, like everything else we just need to ensure our
-	 * software model is up-to-date.
+	 * Now that nested EPT has been disabled, load the MMU's CR3 and
+	 * possibly PDPTRs from vmcs01 (if necessary).  This should not
+	 * happen for VMFail, but we get here if the check was caught by
+	 * the processor and therefore the guest CR3 was loaded prematurely.
 	 */
+	kvm_mmu_unload(vcpu);
+	if (nested_vmx_load_cr3(vcpu, vmcs_readl(GUEST_CR3), false, !enable_ept, &ignored))
+		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
 	if (enable_ept && is_pae_paging(vcpu))
 		ept_save_pdptrs(vcpu);
 
-	kvm_mmu_reset_context(vcpu);
-
 	/*
 	 * This nasty bit of open coding is a compromise between blindly
 	 * loading L1's MSRs using the exit load lists (incorrect emulation
-- 
2.52.0

From nobody Mon Jun 8 08:52:46 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 2/3] KVM: MMU: unconditionally clear MMIO cache on root rebuild
Date: Thu, 4 Jun 2026 12:07:32 -0400
Message-ID: <20260604160733.12555-3-pbonzini@redhat.com>
In-Reply-To: <20260604160733.12555-1-pbonzini@redhat.com>
References: <20260604160733.12555-1-pbonzini@redhat.com>

Upon changing CR3, the MMIO cache becomes invalid because the GVA->GPA
mapping has changed.  However, kvm_mmu_new_pgd() calls
vcpu_clear_mmio_info() only if the fast switch succeeded.  The
early-return path instead leaves the root invalid; the next entry then
calls kvm_mmu_reload() and from there kvm_mmu_load().

kvm_mmu_load() calls kvm_mmu_sync_roots(), which clears the MMIO cache,
but one combination that falls through is root_role.direct==1, i.e.
CR0.PG=0, for which kvm_mmu_sync_roots() bails before reaching the call
to vcpu_clear_mmio_info().

That combination is barely reachable: a valid direct root is pretty
much always a fast-switch success, because the lookup does not check
the PGD for a match in that case.  The early return for a direct root
thus requires the current root to already be invalid, and
kvm_mmu_unload() itself clears the MMIO cache.

That said, doing an independent clear in the style of kvm_mmu_new_pgd()
is more obviously correct and basically free, so harden it.

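The gap described above can be sketched with a toy userspace model.
Everything here is hypothetical: struct toy_mmu_vcpu, its fields and
the toy_* functions are stand-ins invented for the illustration, not
KVM's data structures, and the hardened flag only mimics the effect of
the added vcpu_clear_mmio_info() call.  The model also says nothing
about how reachable the case is in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in KVM. */
struct toy_mmu_vcpu {
	bool root_valid;       /* a usable root has been built */
	bool root_direct;      /* stands in for root_role.direct */
	bool mmio_cache_valid; /* stands in for the cached MMIO GVA info */
};

/* Stand-in for clearing the MMIO cache. */
void toy_clear_mmio(struct toy_mmu_vcpu *v)
{
	assert(v != NULL);
	v->mmio_cache_valid = false;
}

/* Stand-in for the sync step: bails for direct roots before the clear. */
void toy_sync_roots(struct toy_mmu_vcpu *v)
{
	assert(v != NULL);
	if (v->root_direct)
		return;
	toy_clear_mmio(v);
}

/*
 * Stand-in for rebuilding the root.  With hardened set, the cache is
 * cleared unconditionally before the sync step, as the patch does.
 */
void toy_mmu_load(struct toy_mmu_vcpu *v, bool hardened)
{
	assert(v != NULL);
	v->root_valid = true;
	if (hardened)
		toy_clear_mmio(v);
	toy_sync_roots(v);
}
```

For a direct root, toy_mmu_load(&v, false) leaves mmio_cache_valid
untouched, while toy_mmu_load(&v, true) always clears it; for a
non-direct root both variants clear it.
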
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index f8aa7eda661e..6689c9f8ae16 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6138,6 +6138,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 	if (r)
 		goto out;
 
+	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
 	kvm_mmu_sync_roots(vcpu);
 
 	kvm_mmu_load_pgd(vcpu);
-- 
2.52.0

From nobody Mon Jun 8 08:52:46 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 3/3] KVM: nVMX: remove unnecessary unload on processor-detected VMFail
Date: Thu, 4 Jun 2026 12:07:33 -0400
Message-ID: <20260604160733.12555-4-pbonzini@redhat.com>
In-Reply-To: <20260604160733.12555-1-pbonzini@redhat.com>
References: <20260604160733.12555-1-pbonzini@redhat.com>

nested_vmx_restore_host_state() follows a scheme similar to
load_vmcs12_host_state(), which does not need a kvm_mmu_unload().  So,
does nested_vmx_restore_host_state() need it?

The answer is no.  In the shadow case, kvm_init_mmu() in
nested_vmx_load_cr3() is enough to set a root_role with guest_mode==0;
kvm_mmu_new_pgd() is then able to reuse an old root.  In the EPT case,
root_mmu still holds L1's valid root because L2 used guest_mmu.
Removing kvm_mmu_unload() is thus marginally more efficient, and it
makes the two host state restore paths identical.

The other thing that kvm_mmu_unload() does is clear the MMIO GVA cache.
That is already ensured by an earlier change, which calls
vcpu_clear_mmio_info() from kvm_mmu_load() rather than just from
kvm_mmu_new_pgd().

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/nested.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d612a5d071fc..8b20a5eac1c9 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4992,7 +4992,6 @@ static void nested_vmx_restore_host_state(struct kvm_vcpu *vcpu)
 	 * happen for VMFail, but we get here if the check was caught by
 	 * the processor and therefore the guest CR3 was loaded prematurely.
 	 */
-	kvm_mmu_unload(vcpu);
 	if (nested_vmx_load_cr3(vcpu, vmcs_readl(GUEST_CR3), false, !enable_ept, &ignored))
 		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
 	if (enable_ept && is_pae_paging(vcpu))
-- 
2.52.0