From: Sean Christopherson
Reply-To: Sean Christopherson
Date: Wed, 28 Jan 2026 17:15:05 -0800
Message-ID: <20260129011517.3545883-34-seanjc@google.com>
In-Reply-To: <20260129011517.3545883-1-seanjc@google.com>
References: <20260129011517.3545883-1-seanjc@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.53.0.rc1.217.geba53bf80e-goog
Subject: [RFC PATCH v5 33/45] KVM: TDX: Hoist tdx_sept_remove_private_spte() above set_private_spte()
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, Kiryl Shutsemau, Sean Christopherson, Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
 kvm@vger.kernel.org, Kai Huang, Rick Edgecombe, Yan Zhao,
 Vishal Annapurve, Ackerley Tng, Sagi Shahar, Binbin Wu, Xiaoyao Li,
 Isaku Yamahata

Move tdx_sept_remove_private_spte() (and its tdx_track() helper) above
tdx_sept_set_private_spte() in anticipation of routing all non-atomic
S-EPT writes (with the exception of reclaiming non-leaf pages) through
the "set" API.

No functional change intended.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/tdx.c | 194 ++++++++++++++++++++---------------------
 1 file changed, 97 insertions(+), 97 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index e451acdb0978..0f3d27699a3d 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1670,6 +1670,52 @@ static int tdx_mem_page_aug(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
+/*
+ * Ensure shared and private EPTs to be flushed on all vCPUs.
+ * tdh_mem_track() is the only caller that increases TD epoch. An increase in
+ * the TD epoch (e.g., to value "N + 1") is successful only if no vCPUs are
+ * running in guest mode with the value "N - 1".
+ *
+ * A successful execution of tdh_mem_track() ensures that vCPUs can only run in
+ * guest mode with TD epoch value "N" if no TD exit occurs after the TD epoch
+ * being increased to "N + 1".
+ *
+ * Kicking off all vCPUs after that further results in no vCPUs can run in guest
+ * mode with TD epoch value "N", which unblocks the next tdh_mem_track() (e.g.
+ * to increase TD epoch to "N + 2").
+ *
+ * TDX module will flush EPT on the next TD enter and make vCPUs to run in
+ * guest mode with TD epoch value "N + 1".
+ *
+ * kvm_make_all_cpus_request() guarantees all vCPUs are out of guest mode by
+ * waiting empty IPI handler ack_kick().
+ *
+ * No action is required to the vCPUs being kicked off since the kicking off
+ * occurs certainly after TD epoch increment and before the next
+ * tdh_mem_track().
+ */
+static void tdx_track(struct kvm *kvm)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	u64 err;
+
+	/* If TD isn't finalized, it's before any vcpu running. */
+	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE))
+		return;
+
+	/*
+	 * The full sequence of TDH.MEM.TRACK and forcing vCPUs out of guest
+	 * mode must be serialized, as TDH.MEM.TRACK will fail if the previous
+	 * tracking epoch hasn't completed.
+	 */
+	lockdep_assert_held_write(&kvm->mmu_lock);
+
+	err = tdh_do_no_vcpus(tdh_mem_track, kvm, &kvm_tdx->td);
+	TDX_BUG_ON(err, TDH_MEM_TRACK, kvm);
+
+	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
+}
+
 static struct page *tdx_spte_to_external_spt(struct kvm *kvm, gfn_t gfn,
 					     u64 new_spte, enum pg_level level)
 {
@@ -1705,6 +1751,57 @@ static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
 	return 0;
 }
 
+static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
+					 enum pg_level level, u64 mirror_spte)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	kvm_pfn_t pfn = spte_to_pfn(mirror_spte);
+	gpa_t gpa = gfn_to_gpa(gfn);
+	u64 err, entry, level_state;
+
+	lockdep_assert_held_write(&kvm->mmu_lock);
+
+	/*
+	 * HKID is released after all private pages have been removed, and set
+	 * before any might be populated. Warn if zapping is attempted when
+	 * there can't be anything populated in the private EPT.
+	 */
+	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
+		return;
+
+	/* TODO: handle large pages. */
+	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
+		return;
+
+	err = tdh_do_no_vcpus(tdh_mem_range_block, kvm, &kvm_tdx->td, gpa,
+			      level, &entry, &level_state);
+	if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm))
+		return;
+
+	/*
+	 * TDX requires TLB tracking before dropping private page. Do
+	 * it here, although it is also done later.
+	 */
+	tdx_track(kvm);
+
+	/*
+	 * When zapping private page, write lock is held. So no race condition
+	 * with other vcpu sept operation.
+	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
+	 */
+	err = tdh_do_no_vcpus(tdh_mem_page_remove, kvm, &kvm_tdx->td, gpa,
+			      level, &entry, &level_state);
+	if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_REMOVE, entry, level_state, kvm))
+		return;
+
+	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, pfn, level);
+	if (TDX_BUG_ON(err, TDH_PHYMEM_PAGE_WBINVD, kvm))
+		return;
+
+	__tdx_quirk_reset_page(pfn, level);
+	tdx_pamt_put(pfn, level);
+}
+
 static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
 				     u64 new_spte, enum pg_level level)
 {
@@ -1756,52 +1853,6 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
 	return ret;
 }
 
-/*
- * Ensure shared and private EPTs to be flushed on all vCPUs.
- * tdh_mem_track() is the only caller that increases TD epoch. An increase in
- * the TD epoch (e.g., to value "N + 1") is successful only if no vCPUs are
- * running in guest mode with the value "N - 1".
- *
- * A successful execution of tdh_mem_track() ensures that vCPUs can only run in
- * guest mode with TD epoch value "N" if no TD exit occurs after the TD epoch
- * being increased to "N + 1".
- *
- * Kicking off all vCPUs after that further results in no vCPUs can run in guest
- * mode with TD epoch value "N", which unblocks the next tdh_mem_track() (e.g.
- * to increase TD epoch to "N + 2").
- *
- * TDX module will flush EPT on the next TD enter and make vCPUs to run in
- * guest mode with TD epoch value "N + 1".
- *
- * kvm_make_all_cpus_request() guarantees all vCPUs are out of guest mode by
- * waiting empty IPI handler ack_kick().
- *
- * No action is required to the vCPUs being kicked off since the kicking off
- * occurs certainly after TD epoch increment and before the next
- * tdh_mem_track().
- */
-static void tdx_track(struct kvm *kvm)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	u64 err;
-
-	/* If TD isn't finalized, it's before any vcpu running. */
-	if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE))
-		return;
-
-	/*
-	 * The full sequence of TDH.MEM.TRACK and forcing vCPUs out of guest
-	 * mode must be serialized, as TDH.MEM.TRACK will fail if the previous
-	 * tracking epoch hasn't completed.
-	 */
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
-	err = tdh_do_no_vcpus(tdh_mem_track, kvm, &kvm_tdx->td);
-	TDX_BUG_ON(err, TDH_MEM_TRACK, kvm);
-
-	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
-}
-
 static void tdx_sept_reclaim_private_sp(struct kvm *kvm, gfn_t gfn,
 					struct kvm_mmu_page *sp)
 {
@@ -1824,57 +1875,6 @@ static void tdx_sept_reclaim_private_sp(struct kvm *kvm, gfn_t gfn,
 	sp->external_spt = NULL;
 }
 
-static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
-					 enum pg_level level, u64 mirror_spte)
-{
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-	kvm_pfn_t pfn = spte_to_pfn(mirror_spte);
-	gpa_t gpa = gfn_to_gpa(gfn);
-	u64 err, entry, level_state;
-
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
-	/*
-	 * HKID is released after all private pages have been removed, and set
-	 * before any might be populated. Warn if zapping is attempted when
-	 * there can't be anything populated in the private EPT.
-	 */
-	if (KVM_BUG_ON(!is_hkid_assigned(to_kvm_tdx(kvm)), kvm))
-		return;
-
-	/* TODO: handle large pages. */
-	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
-		return;
-
-	err = tdh_do_no_vcpus(tdh_mem_range_block, kvm, &kvm_tdx->td, gpa,
-			      level, &entry, &level_state);
-	if (TDX_BUG_ON_2(err, TDH_MEM_RANGE_BLOCK, entry, level_state, kvm))
-		return;
-
-	/*
-	 * TDX requires TLB tracking before dropping private page. Do
-	 * it here, although it is also done later.
-	 */
-	tdx_track(kvm);
-
-	/*
-	 * When zapping private page, write lock is held. So no race condition
-	 * with other vcpu sept operation.
-	 * Race with TDH.VP.ENTER due to (0-step mitigation) and Guest TDCALLs.
-	 */
-	err = tdh_do_no_vcpus(tdh_mem_page_remove, kvm, &kvm_tdx->td, gpa,
-			      level, &entry, &level_state);
-	if (TDX_BUG_ON_2(err, TDH_MEM_PAGE_REMOVE, entry, level_state, kvm))
-		return;
-
-	err = tdh_phymem_page_wbinvd_hkid((u16)kvm_tdx->hkid, pfn, level);
-	if (TDX_BUG_ON(err, TDH_PHYMEM_PAGE_WBINVD, kvm))
-		return;
-
-	__tdx_quirk_reset_page(pfn, level);
-	tdx_pamt_put(pfn, level);
-}
-
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 			   int trig_mode, int vector)
 {
-- 
2.53.0.rc1.217.geba53bf80e-goog