From nobody Mon Feb 9 00:00:56 2026
Reply-To: Sean Christopherson
Date: Wed, 28 Jan 2026 17:14:41 -0800
In-Reply-To: <20260129011517.3545883-1-seanjc@google.com>
References: <20260129011517.3545883-1-seanjc@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.53.0.rc1.217.geba53bf80e-goog
Message-ID: <20260129011517.3545883-10-seanjc@google.com>
Subject: [RFC PATCH v5 09/45] KVM: x86: Rework .free_external_spt() into .reclaim_external_sp()
From: Sean Christopherson
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86@kernel.org, Kiryl Shutsemau, Sean Christopherson, Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	kvm@vger.kernel.org, Kai Huang, Rick Edgecombe, Yan Zhao,
	Vishal Annapurve, Ackerley Tng, Sagi Shahar, Binbin Wu, Xiaoyao Li,
	Isaku Yamahata
Content-Type: text/plain; charset="utf-8"

Massage .free_external_spt() into .reclaim_external_sp() to free up (pun
intended) "free" for actually freeing memory, and to allow TDX to do more
than just "free" the S-EPT entry.  Specifically, nullify external_spt to
leak the S-EPT page if reclaiming the page fails, as that detail and
implementation choice has no business living in the TDP MMU.

Use "sp" instead of "spt" even though "spt" is arguably more accurate, as
"spte" and "spt" are dangerously close in name, and because the key
parameter is a kvm_mmu_page, not a pointer to an S-EPT page table.

Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm-x86-ops.h |  2 +-
 arch/x86/include/asm/kvm_host.h    |  4 ++--
 arch/x86/kvm/mmu/tdp_mmu.c         | 13 ++-----------
 arch/x86/kvm/vmx/tdx.c             | 27 ++++++++++++---------------
 4 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 57eb1f4832ae..c17cedc485c9 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -95,8 +95,8 @@ KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr)
 KVM_X86_OP_OPTIONAL_RET0(get_mt_mask)
 KVM_X86_OP(load_mmu_pgd)
 KVM_X86_OP_OPTIONAL_RET0(set_external_spte)
-KVM_X86_OP_OPTIONAL_RET0(free_external_spt)
 KVM_X86_OP_OPTIONAL(remove_external_spte)
+KVM_X86_OP_OPTIONAL(reclaim_external_sp)
 KVM_X86_OP(has_wbinvd_exit)
 KVM_X86_OP(get_l2_tsc_offset)
 KVM_X86_OP(get_l2_tsc_multiplier)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d12ca0f8a348..b35a07ed11fb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1858,8 +1858,8 @@ struct kvm_x86_ops {
 				 u64 mirror_spte);
 
 	/* Update external page tables for page table about to be freed. */
-	int (*free_external_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
-				 void *external_spt);
+	void (*reclaim_external_sp)(struct kvm *kvm, gfn_t gfn,
+				    struct kvm_mmu_page *sp);
 
 	/* Update external page table from spte getting removed, and flush TLB. */
 	void (*remove_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level,
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 27ac520f2a89..18764dbc97ea 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -456,17 +456,8 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
 				    old_spte, FROZEN_SPTE, level, shared);
 	}
 
-	if (is_mirror_sp(sp) &&
-	    WARN_ON(kvm_x86_call(free_external_spt)(kvm, base_gfn, sp->role.level,
-						    sp->external_spt))) {
-		/*
-		 * Failed to free page table page in mirror page table and
-		 * there is nothing to do further.
-		 * Intentionally leak the page to prevent the kernel from
-		 * accessing the encrypted page.
-		 */
-		sp->external_spt = NULL;
-	}
+	if (is_mirror_sp(sp))
+		kvm_x86_call(reclaim_external_sp)(kvm, base_gfn, sp);
 
 	call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
 }
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 30494f9ceb31..66bc3ceb5e17 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1783,27 +1783,24 @@ static void tdx_track(struct kvm *kvm)
 	kvm_make_all_cpus_request(kvm, KVM_REQ_OUTSIDE_GUEST_MODE);
 }
 
-static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
-				     enum pg_level level, void *private_spt)
+static void tdx_sept_reclaim_private_sp(struct kvm *kvm, gfn_t gfn,
+					struct kvm_mmu_page *sp)
 {
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
-
 	/*
-	 * free_external_spt() is only called after hkid is freed when TD is
-	 * tearing down.
 	 * KVM doesn't (yet) zap page table pages in mirror page table while
 	 * TD is active, though guest pages mapped in mirror page table could be
 	 * zapped during TD is active, e.g. for shared <-> private conversion
 	 * and slot move/deletion.
+	 *
+	 * In other words, KVM should only free mirror page tables after the
+	 * TD's hkid is freed, when the TD is being torn down.
+	 *
+	 * If the S-EPT PTE can't be removed for any reason, intentionally leak
+	 * the page to prevent the kernel from accessing the encrypted page.
 	 */
-	if (KVM_BUG_ON(is_hkid_assigned(kvm_tdx), kvm))
-		return -EIO;
-
-	/*
-	 * The HKID assigned to this TD was already freed and cache was
-	 * already flushed. We don't have to flush again.
-	 */
-	return tdx_reclaim_page(virt_to_page(private_spt));
+	if (KVM_BUG_ON(is_hkid_assigned(to_kvm_tdx(kvm)), kvm) ||
+	    tdx_reclaim_page(virt_to_page(sp->external_spt)))
+		sp->external_spt = NULL;
 }
 
 static void tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
@@ -3617,7 +3614,7 @@ void __init tdx_hardware_setup(void)
 	vt_x86_ops.vm_size = max_t(unsigned int, vt_x86_ops.vm_size, sizeof(struct kvm_tdx));
 
 	vt_x86_ops.set_external_spte = tdx_sept_set_private_spte;
-	vt_x86_ops.free_external_spt = tdx_sept_free_private_spt;
+	vt_x86_ops.reclaim_external_sp = tdx_sept_reclaim_private_sp;
 	vt_x86_ops.remove_external_spte = tdx_sept_remove_private_spte;
 	vt_x86_ops.protected_apic_has_interrupt = tdx_protected_apic_has_interrupt;
 }
-- 
2.53.0.rc1.217.geba53bf80e-goog