From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Xiaoyao Li
Subject: [RFC PATCH v3 05/16] KVM: TDX: Pass size to reclaim_page()
Date: Thu, 12 Jan 2023 08:43:57 -0800

From: Xiaoyao Li

A 2MB large page can be tdh_mem_page_aug()'ed into a TD directly.  In
that case the page must be reclaimed and cleared at 2MB granularity,
not just the first 4KB.  Pass the page level down to tdx_reclaim_page()
and tdx_clear_page() so both operate on the whole mapping.
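Not part of this patch, but for reference: the size handed to
tdx_clear_page() comes from KVM_HPAGE_SIZE(level).  Below is a minimal
standalone sketch of that level-to-bytes mapping, mirroring the
PG_LEVEL_* and KVM_HPAGE_* definitions in
arch/x86/include/asm/kvm_host.h (names reused purely for illustration):

/* sketch.c: illustration only, not kernel code */
#include <stdio.h>

enum pg_level {
        PG_LEVEL_NONE,
        PG_LEVEL_4K,    /* = 1, 4KB leaf */
        PG_LEVEL_2M,    /* = 2, 2MB leaf */
        PG_LEVEL_1G,    /* = 3, 1GB leaf */
};

#define PAGE_SHIFT              12
/* Each level maps 9 more address bits than the one below it. */
#define KVM_HPAGE_SHIFT(x)      (PAGE_SHIFT + ((x) - PG_LEVEL_4K) * 9)
#define KVM_HPAGE_SIZE(x)       (1UL << KVM_HPAGE_SHIFT(x))

int main(void)
{
        printf("PG_LEVEL_4K -> %lu bytes\n", KVM_HPAGE_SIZE(PG_LEVEL_4K));
        printf("PG_LEVEL_2M -> %lu bytes\n", KVM_HPAGE_SIZE(PG_LEVEL_2M));
        return 0;       /* prints 4096 and 2097152 */
}

With PG_LEVEL_4K == 1, KVM_HPAGE_SIZE(PG_LEVEL_2M) evaluates to 2MB,
which is why passing the level through is enough for the clearing loop
to cover the whole large page.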
Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 1bc07dfe765a..8bc8fd7f28eb 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -184,14 +184,17 @@ void tdx_hardware_disable(void)
         tdx_disassociate_vp(&tdx->vcpu);
 }
 
-static void tdx_clear_page(unsigned long page_pa)
+static void tdx_clear_page(unsigned long page_pa, int size)
 {
         const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0)));
         void *page = __va(page_pa);
         unsigned long i;
 
+        WARN_ON_ONCE(size % PAGE_SIZE);
+
         if (!static_cpu_has(X86_FEATURE_MOVDIR64B)) {
-                clear_page(page);
+                for (i = 0; i < size; i += PAGE_SIZE)
+                        clear_page(page + i);
                 return;
         }
 
@@ -205,7 +208,7 @@ static void tdx_clear_page(unsigned long page_pa)
          * The cache line could be poisoned (even without MKTME-i), clear the
          * poison bit.
          */
-        for (i = 0; i < PAGE_SIZE; i += 64)
+        for (i = 0; i < size; i += 64)
                 movdir64b(page + i, zero_page);
         /*
          * MOVDIR64B store uses WC buffer. Prevent following memory reads
@@ -214,7 +217,8 @@ static void tdx_clear_page(unsigned long page_pa)
         __mb();
 }
 
-static int tdx_reclaim_page(hpa_t pa, bool do_wb, u16 hkid)
+static int tdx_reclaim_page(hpa_t pa, enum pg_level level,
+                            bool do_wb, u16 hkid)
 {
         struct tdx_module_output out;
         u64 err;
@@ -232,8 +236,10 @@ static int tdx_reclaim_page(hpa_t pa, bool do_wb, u16 hkid)
                 pr_tdx_error(TDH_PHYMEM_PAGE_RECLAIM, err, &out);
                 return -EIO;
         }
+        /* out.r8 == tdx sept page level */
+        WARN_ON_ONCE(out.r8 != pg_level_to_tdx_sept_level(level));
 
-        if (do_wb) {
+        if (do_wb && level == PG_LEVEL_4K) {
                 /*
                  * Only TDR page gets into this path. No contention is expected
                  * because of the last page of TD.
@@ -245,7 +251,7 @@ static int tdx_reclaim_page(hpa_t pa, bool do_wb, u16 hkid)
                 }
         }
 
-        tdx_clear_page(pa);
+        tdx_clear_page(pa, KVM_HPAGE_SIZE(level));
         return 0;
 }
 
@@ -259,7 +265,7 @@ static void tdx_reclaim_td_page(unsigned long td_page_pa)
          * was already flushed by TDH.PHYMEM.CACHE.WB before here, So
          * cache doesn't need to be flushed again.
          */
-        if (WARN_ON(tdx_reclaim_page(td_page_pa, false, 0)))
+        if (WARN_ON(tdx_reclaim_page(td_page_pa, PG_LEVEL_4K, false, 0)))
                 /* If reclaim failed, leak the page. */
                 return;
         free_page((unsigned long)__va(td_page_pa));
@@ -436,7 +442,7 @@ void tdx_vm_free(struct kvm *kvm)
          * while operating on TD (Especially reclaiming TDCS). Cache flush with
          * TDX global HKID is needed.
          */
-        if (tdx_reclaim_page(kvm_tdx->tdr_pa, true, tdx_global_keyid))
+        if (tdx_reclaim_page(kvm_tdx->tdr_pa, PG_LEVEL_4K, true, tdx_global_keyid))
                 return;
 
         free_page((unsigned long)__va(kvm_tdx->tdr_pa));
@@ -1427,7 +1433,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
                  * The HKID assigned to this TD was already freed and cache
                  * was already flushed. We don't have to flush again.
                  */
-                err = tdx_reclaim_page(hpa, false, 0);
+                err = tdx_reclaim_page(hpa, level, false, 0);
                 if (KVM_BUG_ON(err, kvm))
                         return -EIO;
                 tdx_unpin(kvm, pfn);
@@ -1566,7 +1572,7 @@ static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
          * already flushed. We don't have to flush again.
          */
         if (!is_hkid_assigned(kvm_tdx))
-                return tdx_reclaim_page(__pa(private_spt), false, 0);
+                return tdx_reclaim_page(__pa(private_spt), PG_LEVEL_4K, false, 0);
 
         /*
          * free_private_spt() is (obviously) called when a shadow page is being
-- 
2.25.1
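A note on the new WARN_ON_ONCE() in tdx_reclaim_page(): it compares
out.r8 (the page level reported back by TDH.PHYMEM.PAGE.RECLAIM)
against pg_level_to_tdx_sept_level(level), a helper that is not part
of this diff. TDX Secure-EPT levels are zero-based (0 = 4KB, 1 = 2MB,
2 = 1GB) while KVM's enum pg_level starts at PG_LEVEL_4K == 1, so the
helper is presumably the simple off-by-one conversion sketched below;
treat the exact definition as an assumption, since it is defined
elsewhere in this series:

/* Assumed definition, for illustration; requires kernel context. */
static inline int pg_level_to_tdx_sept_level(enum pg_level level)
{
        /* PG_LEVEL_NONE has no Secure-EPT counterpart. */
        WARN_ON_ONCE(level == PG_LEVEL_NONE);
        return level - 1;       /* PG_LEVEL_4K (1) -> SEPT level 0 */
}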