From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A3F1254869; Fri, 2 May 2025 13:08:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191324; cv=none; b=tQ7U8heDc/tYq2gvOxSbgNYM7vjLIo/o4BoJZ/0iKGyLfTCqCa47u2DonzOjwHjql2H9d3PO9z7gJ5va0NPe1Al2J87GBInbYT89ppEaQ3TW15MwqtmV9DDW/SRvRdYpXkDHmeIdDs1GwGGFJJAJIFuUph9hvb/x1yCltymv2ek= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191324; c=relaxed/simple; bh=C9kO+sFs+Azqq39AMtBkkj6l8k1vV7yvPw74WxauvEg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=SBiVc8AiBppe6SkfikGB4b+Ov7EKJpDXxxdff3/6HfHcH9+6MkiqIOcOVFNW37PGsVxTbssRDauQiW8kYnyhui2l425W903tG+ocfRDpx9ao1U/fgKSMLDIEcnm/9eDaeIsW7biapQf64yhFH1MXqdavFr4o/xinlFaSd9Kg+t8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=aKivw+jW; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="aKivw+jW" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191323; x=1777727323; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=C9kO+sFs+Azqq39AMtBkkj6l8k1vV7yvPw74WxauvEg=; 
b=aKivw+jW++XQodfH2R/kFVrrAn8tp6F94uZQPAlon3RMIEq78YT6Zykl Yqwnems8cQvEzmmqxr41+tyLITOWennJ1lnmGeQwULfZEZP6GK2PLBtJY i777EI6VX9dNrX3sbgBIRwutnAVDq9O2kU4QHgk7bbLlnl6ed83wh4IBa PiO7G85jJ/pFGbDvGUWVZnCKhS7gXvy18BVfNe6abkpXXGXOg7XXgp7qI jLpiXalQwaeRMzghNmdK/Z9siNZXfxOqRX+PKCKAKrupBFlj3+lGuWG4Z 47gx1zK13Ep8+1E6YNQk78iptNn5ipLUqbcFTXRcyJ8GPU6Z9azLLfzFm w==; X-CSE-ConnectionGUID: VBE3TFEvQb6wQ5nHj2brxg== X-CSE-MsgGUID: m7dDWvx1REavZRfwjYB7rw== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="58495260" X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="58495260" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:41 -0700 X-CSE-ConnectionGUID: A3EKNsEKT5qkuyTiGC645w== X-CSE-MsgGUID: fMDb+YDqRMO63jByr9l2rw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="138657776" Received: from black.fi.intel.com ([10.237.72.28]) by fmviesa003.fm.intel.com with ESMTP; 02 May 2025 06:08:37 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 0E6DA9F; Fri, 02 May 2025 16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. 
Shutemov" Subject: [RFC, PATCH 01/12] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT Date: Fri, 2 May 2025 16:08:17 +0300 Message-ID: <20250502130828.4071412-2-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The Physical Address Metadata Table (PAMT) holds TDX metadata for physical memory and must be allocated by the kernel during TDX module initialization. The exact size of the required PAMT memory is determined by the TDX module and may vary between TDX module versions, but currently it is approximately 0.4% of the system memory. This is a significant commitment, especially if it is not known upfront whether the machine will run any TDX guests. The Dynamic PAMT feature reduces static PAMT allocations. PAMT_1G and PAMT_2M levels are still allocated on TDX module initialization, but the PAMT_4K level is allocated dynamically, reducing static allocations to approximately 0.004% of the system memory. With Dynamic PAMT, the kernel no longer needs to allocate PAMT_4K on boot, but instead must allocate a page bitmap. The TDX module determines how many bits per page need to be allocated (currently it is 1). Allocate the bitmap if the kernel boots on a machine with Dynamic PAMT. Signed-off-by: Kirill A. 
Shutemov --- arch/x86/include/asm/tdx.h | 5 +++++ arch/x86/include/asm/tdx_global_metadata.h | 1 + arch/x86/virt/vmx/tdx/tdx.c | 23 ++++++++++++++++++++- arch/x86/virt/vmx/tdx/tdx_global_metadata.c | 3 +++ 4 files changed, 31 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 26ffc792e673..9701876d4e16 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -125,6 +125,11 @@ int tdx_enable(void); const char *tdx_dump_mce_info(struct mce *m); const struct tdx_sys_info *tdx_get_sysinfo(void); =20 +static inline bool tdx_supports_dynamic_pamt(const struct tdx_sys_info *sy= sinfo) +{ + return false; /* To be enabled when kernel is ready */ +} + int tdx_guest_keyid_alloc(void); u32 tdx_get_nr_guest_keyids(void); void tdx_guest_keyid_free(unsigned int keyid); diff --git a/arch/x86/include/asm/tdx_global_metadata.h b/arch/x86/include/= asm/tdx_global_metadata.h index 060a2ad744bf..5eb808b23997 100644 --- a/arch/x86/include/asm/tdx_global_metadata.h +++ b/arch/x86/include/asm/tdx_global_metadata.h @@ -15,6 +15,7 @@ struct tdx_sys_info_tdmr { u16 pamt_4k_entry_size; u16 pamt_2m_entry_size; u16 pamt_1g_entry_size; + u8 pamt_page_bitmap_entry_bits; }; =20 struct tdx_sys_info_td_ctrl { diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index f5e2a937c1e7..c8bfd765e451 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -470,6 +470,18 @@ static unsigned long tdmr_get_pamt_sz(struct tdmr_info= *tdmr, int pgsz, return pamt_sz; } =20 +static unsigned long tdmr_get_pamt_bitmap_sz(struct tdmr_info *tdmr) +{ + unsigned long pamt_sz, nr_pamt_entries; + int bits_per_entry; + + bits_per_entry =3D tdx_sysinfo.tdmr.pamt_page_bitmap_entry_bits; + nr_pamt_entries =3D tdmr->size >> PAGE_SHIFT; + pamt_sz =3D DIV_ROUND_UP(nr_pamt_entries * bits_per_entry, BITS_PER_BYTE); + + return ALIGN(pamt_sz, PAGE_SIZE); +} + /* * Locate a NUMA node which should hold the allocation of the 
@tdmr * PAMT. This node will have some memory covered by the TDMR. The @@ -522,7 +534,16 @@ static int tdmr_set_up_pamt(struct tdmr_info *tdmr, * and the total PAMT size. */ tdmr_pamt_size =3D 0; - for (pgsz =3D TDX_PS_4K; pgsz < TDX_PS_NR; pgsz++) { + pgsz =3D TDX_PS_4K; + + /* With Dynamic PAMT, PAMT_4K is replaced with a bitmap */ + if (tdx_supports_dynamic_pamt(&tdx_sysinfo)) { + pamt_size[pgsz] =3D tdmr_get_pamt_bitmap_sz(tdmr); + tdmr_pamt_size +=3D pamt_size[pgsz]; + pgsz++; + } + + for (; pgsz < TDX_PS_NR; pgsz++) { pamt_size[pgsz] =3D tdmr_get_pamt_sz(tdmr, pgsz, pamt_entry_size[pgsz]); tdmr_pamt_size +=3D pamt_size[pgsz]; diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vm= x/tdx/tdx_global_metadata.c index 13ad2663488b..683925bcc9eb 100644 --- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c +++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c @@ -33,6 +33,9 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr= *sysinfo_tdmr) sysinfo_tdmr->pamt_2m_entry_size =3D val; if (!ret && !(ret =3D read_sys_metadata_field(0x9100000100000012, &val))) sysinfo_tdmr->pamt_1g_entry_size =3D val; + if (!ret && tdx_supports_dynamic_pamt(&tdx_sysinfo) && + !(ret =3D read_sys_metadata_field(0x9100000100000013, &val))) + sysinfo_tdmr->pamt_page_bitmap_entry_bits =3D val; =20 return ret; } --=20 2.47.2 From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 42104253F34; Fri, 2 May 2025 13:08:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191323; cv=none; 
b=rE+7BQ50oJHJcyMQSeprdkTcclBu6A1NModfEeOyU6iEPAXzhTCbQE9dfU+0avM1TXUKEaI4/v/GlSpYtHfnn9bBcAablmSLg7ZX0JZkuoxHGh+9qJ5POhVvArk66YbOvMY5/tBYrz1RhshHT249evXXZ41RrX66kLaQzM99UHc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191323; c=relaxed/simple; bh=koLJLqsdXQLlsDvglmbIut6PeC1m9lsxMFhLJpepDOg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gNJPopn3rePMQI49JIpsPS96bEVdgvJFIo+LlWP0m2KTDycTbYc1aA+lKjrG669Vph2O9MFGmwv6ZTOdiQQHfgnIq3YVal4opQo/XFpEXWtY/fjjJGEWAV4LCujtDqwcxfYn8wtmFjtzBRgd3fQIn41eiIaPNbZ5CUF4glXW83s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=E3+yNsXY; arc=none smtp.client-ip=192.198.163.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="E3+yNsXY" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191321; x=1777727321; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=koLJLqsdXQLlsDvglmbIut6PeC1m9lsxMFhLJpepDOg=; b=E3+yNsXYj2zmPZORQ6ihdD98+LpecCujWDRwoc/MN9vONsIU9tarhgc9 HXl4gBvFjzHZHW/JkFG0OsacGX/WuvQPNsEylmZJCLUYqjDZoqdqYrF8G 56VnRvFG0GAs5HG6du5TKRt49jOhu1ZinyiONvqTYGVmbrhzBze/RgDLH 3nXuDZq4zKEv46PqaG/gLhGycBtaYN37drmMg2Ww0HHrNXl0ZuczA8mPf 33JO0Teguud8aiBZm0wKLqADvhLLFA++GlErZlro9CasfcjzVKofEuFg8 5adWKaPjtpXfN2+JR7vn5lloyVnJeEhWeeUFV2VwrPuB2tVSC3FAFynls w==; X-CSE-ConnectionGUID: NUA9G5x1TAiI3tp7WaI4Cw== X-CSE-MsgGUID: tKX+6cM+SaiskHVdyPWP7g== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="58495250" X-IronPort-AV: 
E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="58495250" Received: from fmviesa003.fm.intel.com ([10.60.135.143]) by fmvoesa105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:41 -0700 X-CSE-ConnectionGUID: X0dBsm4/T1OHQX1RNbYFZQ== X-CSE-MsgGUID: IBErx+7zSeuwKQ7RN/B3RQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="138657775" Received: from black.fi.intel.com ([10.237.72.28]) by fmviesa003.fm.intel.com with ESMTP; 02 May 2025 06:08:37 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 1E3241AC; Fri, 02 May 2025 16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [RFC, PATCH 02/12] x86/virt/tdx: Allocate reference counters for PAMT memory Date: Fri, 2 May 2025 16:08:18 +0300 Message-ID: <20250502130828.4071412-3-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The PAMT memory holds metadata for TDX-protected memory. With Dynamic PAMT, PAMT_4K is allocated on demand. The kernel supplies the TDX module with a page pair that covers 2M of host physical memory. The kernel must provide this page pair before using pages from the range for TDX. If this is not done, any SEAMCALL that attempts to use the memory will fail. 
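
A minimal sketch of the 2M granularity described above, not code from this series. The helper names are hypothetical, and the 16-byte PAMT_4K entry size is an assumption chosen so that the result matches the page pair mentioned in the text. The sketch maps a host physical address to the 2M range that must be backed before the page is used, and computes how many 4K pages back one range.

```c
#include <stdint.h>

#define PAGE_SIZE_4K   4096ULL                          /* base page size */
#define RANGE_SIZE_2M  (2ULL << 20)                     /* range covered by one PAMT page set */
#define PAGES_PER_2M   (RANGE_SIZE_2M / PAGE_SIZE_4K)   /* 512 pages per range */

/* Start of the 2M range that contains hpa (hypothetical helper). */
static inline uint64_t pamt_range_base(uint64_t hpa)
{
	return hpa & ~(RANGE_SIZE_2M - 1);
}

/* Index of that 2M range, usable as a slot in a per-range array. */
static inline uint64_t pamt_range_index(uint64_t hpa)
{
	return hpa / RANGE_SIZE_2M;
}

/*
 * Number of 4K pages needed to hold PAMT_4K entries for one 2M range,
 * rounded up. entry_size is in bytes; 16 is an assumed value.
 */
static inline unsigned int pamt_pages_per_range(unsigned int entry_size)
{
	uint64_t bytes = (uint64_t)entry_size * PAGES_PER_2M;

	return (unsigned int)((bytes + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K);
}
```

With a 16-byte entry, 512 entries occupy 8192 bytes, so pamt_pages_per_range(16) evaluates to 2, which is the page pair. A larger entry size would yield more pages per range.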
Allocate reference counters for every 2M range to track PAMT memory usage. This is necessary to accurately determine when PAMT memory needs to be allocated and when it can be freed.

This allocation will consume 2MiB for every 1TiB of physical memory (one 4-byte counter per 2M range).

Tracking PAMT memory usage on the kernel side duplicates what the TDX module does. It is possible to avoid this by lazily allocating PAMT memory on SEAMCALL failure and freeing it based on hints provided by the TDX module when the last user of PAMT memory is no longer present.

However, this approach complicates serialization.

The TDX module takes locks when dealing with PAMT: a shared lock on any SEAMCALL that uses an explicit HPA and an exclusive lock on PAMT.ADD and PAMT.REMOVE. Any SEAMCALL that uses an explicit HPA as an operand may fail if it races with PAMT.ADD/REMOVE.

Since PAMT is a global resource, to prevent failure the kernel would need global locking (per-TD locking is not sufficient). Otherwise, it has to retry on TDX_OPERAND_BUSY.

Neither option is ideal, and tracking PAMT usage on the kernel side seems like a reasonable alternative.

Signed-off-by: Kirill A.
Shutemov --- arch/x86/virt/vmx/tdx/tdx.c | 113 +++++++++++++++++++++++++++++++++++- 1 file changed, 111 insertions(+), 2 deletions(-) diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index c8bfd765e451..00e07a0c908a 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -50,6 +51,8 @@ static DEFINE_PER_CPU(bool, tdx_lp_initialized); =20 static struct tdmr_info_list tdx_tdmr_list; =20 +static atomic_t *pamt_refcounts; + static enum tdx_module_status_t tdx_module_status; static DEFINE_MUTEX(tdx_module_lock); =20 @@ -1035,9 +1038,108 @@ static int config_global_keyid(void) return ret; } =20 +atomic_t *tdx_get_pamt_refcount(unsigned long hpa) +{ + return &pamt_refcounts[hpa / PMD_SIZE]; +} +EXPORT_SYMBOL_GPL(tdx_get_pamt_refcount); + +static int pamt_refcount_populate(pte_t *pte, unsigned long addr, void *da= ta) +{ + unsigned long vaddr; + pte_t entry; + + if (!pte_none(ptep_get(pte))) + return 0; + + vaddr =3D __get_free_page(GFP_KERNEL | __GFP_ZERO); + if (!vaddr) + return -ENOMEM; + + entry =3D pfn_pte(PFN_DOWN(__pa(vaddr)), PAGE_KERNEL); + + spin_lock(&init_mm.page_table_lock); + if (pte_none(ptep_get(pte))) + set_pte_at(&init_mm, addr, pte, entry); + else + free_page(vaddr); + spin_unlock(&init_mm.page_table_lock); + + return 0; +} + +static int pamt_refcount_depopulate(pte_t *pte, unsigned long addr, + void *data) +{ + unsigned long vaddr; + + vaddr =3D (unsigned long)__va(PFN_PHYS(pte_pfn(ptep_get(pte)))); + + spin_lock(&init_mm.page_table_lock); + if (!pte_none(ptep_get(pte))) { + pte_clear(&init_mm, addr, pte); + free_page(vaddr); + } + spin_unlock(&init_mm.page_table_lock); + + return 0; +} + +static int alloc_tdmr_pamt_refcount(struct tdmr_info *tdmr) +{ + unsigned long start, end; + + start =3D (unsigned long)tdx_get_pamt_refcount(tdmr->base); + end =3D (unsigned long)tdx_get_pamt_refcount(tdmr->base + tdmr->size); + start =3D 
round_down(start, PAGE_SIZE); + end =3D round_up(end, PAGE_SIZE); + + return apply_to_page_range(&init_mm, start, end - start, + pamt_refcount_populate, NULL); +} + +static int init_pamt_metadata(void) +{ + size_t size =3D max_pfn / PTRS_PER_PTE * sizeof(*pamt_refcounts); + struct vm_struct *area; + + if (!tdx_supports_dynamic_pamt(&tdx_sysinfo)) + return 0; + + /* + * Reserve vmalloc range for PAMT reference counters. It covers all + * physical address space up to max_pfn. It is going to be populated + * from init_tdmr() only for present memory that available for TDX use. + */ + area =3D get_vm_area(size, VM_IOREMAP); + if (!area) + return -ENOMEM; + + pamt_refcounts =3D area->addr; + return 0; +} + +static void free_pamt_metadata(void) +{ + size_t size =3D max_pfn / PTRS_PER_PTE * sizeof(*pamt_refcounts); + + size =3D round_up(size, PAGE_SIZE); + apply_to_existing_page_range(&init_mm, + (unsigned long)pamt_refcounts, + size, pamt_refcount_depopulate, + NULL); + vfree(pamt_refcounts); + pamt_refcounts =3D NULL; +} + static int init_tdmr(struct tdmr_info *tdmr) { u64 next; + int ret; + + ret =3D alloc_tdmr_pamt_refcount(tdmr); + if (ret) + return ret; =20 /* * Initializing a TDMR can be time consuming. 
To avoid long @@ -1048,7 +1150,6 @@ static int init_tdmr(struct tdmr_info *tdmr) struct tdx_module_args args =3D { .rcx =3D tdmr->base, }; - int ret; =20 ret =3D seamcall_prerr_ret(TDH_SYS_TDMR_INIT, &args); if (ret) @@ -1134,10 +1235,15 @@ static int init_tdx_module(void) if (ret) goto err_reset_pamts; =20 + /* Reserve vmalloc range for PAMT reference counters */ + ret =3D init_pamt_metadata(); + if (ret) + goto err_reset_pamts; + /* Initialize TDMRs to complete the TDX module initialization */ ret =3D init_tdmrs(&tdx_tdmr_list); if (ret) - goto err_reset_pamts; + goto err_free_pamt_metadata; =20 pr_info("%lu KB allocated for PAMT\n", tdmrs_count_pamt_kb(&tdx_tdmr_list= )); =20 @@ -1149,6 +1255,9 @@ static int init_tdx_module(void) put_online_mems(); return ret; =20 +err_free_pamt_metadata: + free_pamt_metadata(); + err_reset_pamts: /* * Part of PAMTs may already have been initialized by the --=20 2.47.2 From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 81B36254876; Fri, 2 May 2025 13:08:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191324; cv=none; b=lF0ABKiZbP8MJfFphSU7X7xTtlX5eNsCimBc8KSDjlPkc/QsBuVigCwDpJa8FXXhEKWtOcSkqER24pP24u112JyP0Ucm8vn0n5RmNciWLPwWTGpS55W5dk13d96vQYkBAflY2VMQIhDSgiIezaeOjT8n8X/Ra9ty+XDyCEXwiDI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191324; c=relaxed/simple; bh=Mxg5gZR5NtpEeMPs/+XTRuULzQ14JvabH1BZykostAY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=SrSn4d7vNFKjyuGLKq27MCcMRfqAhPhduSxFdsumtch1M/1gbjpW0G7/+KCNOfCoBGzb09W24AM0KL132JE8LsI82ybWY0TvQ334bjSR2li74uRyaVoaZEJnM1oGCWU/97ChvGzfxq/tLR7N2Y9zofBiHENSlTPfpubdauo1NRk= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ZV6fTgoc; arc=none smtp.client-ip=198.175.65.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ZV6fTgoc" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191322; x=1777727322; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Mxg5gZR5NtpEeMPs/+XTRuULzQ14JvabH1BZykostAY=; b=ZV6fTgocLjTo74Yq+G7gyPNOB/s8awKiJTqiyT2OqquHRlCkHnOHbuTx BUtdxSTxhgCG31krLS8aOLFHpmVNuu1RfkYWTKKUFePHBggCXgMHJLlof t/3qcGrIOrwsatuPeIwNvxXfwCwlApYzl9g35zsLEeKsbGhJ0+F+ZsS0Y XRuA8q55SLf9mc6mpvUCDg9rOojCvLjeuhOY/OGwF9/r2+5gGv1vkvdAj O1LHZMCrCKUxaaC5znfM/r0CW9AllkiGAh1/bX+f0qF9Gi/Hz8oFfbAiF kgL1FqCipUSzM2YjysPHwp8Plqr+iGe73iq6Umnohnhf5Xl+kZKWq4D2W Q==; X-CSE-ConnectionGUID: lK0rZVdHT1Wt0o08ZGtiTA== X-CSE-MsgGUID: M3KAOI5hRjOTOI3dzHybSA== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="48012952" X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="48012952" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:40 -0700 X-CSE-ConnectionGUID: bKCrRft4Td2TwOs2mpn5LQ== X-CSE-MsgGUID: G6KG1unLTdKZdFp4ECvWwg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="157871062" Received: from black.fi.intel.com ([10.237.72.28]) by fmviesa002.fm.intel.com with ESMTP; 02 May 2025 06:08:37 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 2A6241AE; Fri, 02 May 2025 
16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [RFC, PATCH 03/12] x86/virt/tdx: Add wrappers for TDH.PHYMEM.PAMT.ADD/REMOVE Date: Fri, 2 May 2025 16:08:19 +0300 Message-ID: <20250502130828.4071412-4-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

On a system with Dynamic PAMT enabled, the kernel must allocate memory for PAMT_4K as needed and reclaim it when it is no longer in use.

The TDX module requires space to store 16 bytes of metadata per 4k page, or 8k for every 2M range of physical memory (512 pages x 16 bytes). The TDX module takes this 8k of memory as a pair of 4k pages. These pages do not need to be contiguous.

The number of pages needed to cover a 2M range can grow if the size of a PAMT entry increases. tdx_nr_pamt_pages() reports the number of pages needed.

TDH.PHYMEM.PAMT.ADD populates PAMT_4K for a given HPA. The kernel must provide the addresses of two pages, covering a 2M range starting from HPA.

TDH.PHYMEM.PAMT.REMOVE withdraws PAMT_4K memory for a given HPA, returning the addresses of the pages used for PAMT_4K before the call.

Add wrappers for these SEAMCALLs.

Signed-off-by: Kirill A.
Shutemov --- arch/x86/include/asm/tdx.h | 9 ++++++++ arch/x86/virt/vmx/tdx/tdx.c | 45 +++++++++++++++++++++++++++++++++++++ arch/x86/virt/vmx/tdx/tdx.h | 2 ++ 3 files changed, 56 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 9701876d4e16..a134cf3ecd17 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -130,6 +130,11 @@ static inline bool tdx_supports_dynamic_pamt(const str= uct tdx_sys_info *sysinfo) return false; /* To be enabled when kernel is ready */ } =20 +static inline int tdx_nr_pamt_pages(const struct tdx_sys_info *sysinfo) +{ + return sysinfo->tdmr.pamt_4k_entry_size * PTRS_PER_PTE / PAGE_SIZE; +} + int tdx_guest_keyid_alloc(void); u32 tdx_get_nr_guest_keyids(void); void tdx_guest_keyid_free(unsigned int keyid); @@ -197,6 +202,9 @@ u64 tdh_mem_page_remove(struct tdx_td *td, u64 gpa, u64= level, u64 *ext_err1, u6 u64 tdh_phymem_cache_wb(bool resume); u64 tdh_phymem_page_wbinvd_tdr(struct tdx_td *td); u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *page); +u64 tdh_phymem_pamt_add(unsigned long hpa, struct list_head *pamt_pages); +u64 tdh_phymem_pamt_remove(unsigned long hpa, struct list_head *pamt_pages= ); + #else static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } @@ -204,6 +212,7 @@ static inline int tdx_enable(void) { return -ENODEV; } static inline u32 tdx_get_nr_guest_keyids(void) { return 0; } static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; } static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NU= LL; } +static inline int tdx_nr_pamt_pages(const struct tdx_sys_info *sysinfo) { = return 0; } #endif /* CONFIG_INTEL_TDX_HOST */ =20 #endif /* !__ASSEMBLER__ */ diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 00e07a0c908a..29defdb7f6bc 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -1999,3 +1999,48 @@ u64 
tdh_phymem_page_wbinvd_hkid(u64 hkid, struct pag= e *page) return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args); } EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_hkid); + +u64 tdh_phymem_pamt_add(unsigned long hpa, struct list_head *pamt_pages) +{ + struct tdx_module_args args =3D { + .rcx =3D hpa, + }; + struct page *page; + u64 *p; + + WARN_ON_ONCE(!IS_ALIGNED(hpa & PAGE_MASK, PMD_SIZE)); + + p =3D &args.rdx; + list_for_each_entry(page, pamt_pages, lru) { + *p =3D page_to_phys(page); + p++; + } + + return seamcall(TDH_PHYMEM_PAMT_ADD, &args); +} +EXPORT_SYMBOL_GPL(tdh_phymem_pamt_add); + +u64 tdh_phymem_pamt_remove(unsigned long hpa, struct list_head *pamt_pages) +{ + struct tdx_module_args args =3D { + .rcx =3D hpa, + }; + struct page *page; + u64 *p, ret; + + WARN_ON_ONCE(!IS_ALIGNED(hpa & PAGE_MASK, PMD_SIZE)); + + ret =3D seamcall_ret(TDH_PHYMEM_PAMT_REMOVE, &args); + if (ret) + return ret; + + p =3D &args.rdx; + for (int i =3D 0; i < tdx_nr_pamt_pages(&tdx_sysinfo); i++) { + page =3D phys_to_page(*p); + list_add(&page->lru, pamt_pages); + p++; + } + + return ret; +} +EXPORT_SYMBOL_GPL(tdh_phymem_pamt_remove); diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index 82bb82be8567..46c4214b79fb 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -46,6 +46,8 @@ #define TDH_PHYMEM_PAGE_WBINVD 41 #define TDH_VP_WR 43 #define TDH_SYS_CONFIG 45 +#define TDH_PHYMEM_PAMT_ADD 58 +#define TDH_PHYMEM_PAMT_REMOVE 59 =20 /* * SEAMCALL leaf: --=20 2.47.2 From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D6DE42550D0; Fri, 2 May 2025 13:08:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191325; cv=none; 
b=jS2EBbUxtB0FVAAYG7GbMS46RqJdMTAjPntRTYMxec1HsWghGadW0f30OMFB1XWFeX179adiwOmMSXM9rdtZx81ghzu/K9qJdO20+tiZgpC11kSmK8Fzqvu0N5WD0BppQd6l4g73XQvqsIosxnXVSgqWc17FqRXh5AmxU3z0pGg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191325; c=relaxed/simple; bh=EqV3ExI1VOsG9HGpNc51/naP8y50Je1jebPzL5aK9WQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=khA9lRHZIGtKEFhXwg4VRSrkNagUwtsZ8GzlRz0Hp+tczrppU8LHDeHpLWjImswjJU68ths2nSzV8uyVF7gRszchwa2/JjMS/IaWd6LED+1S+jDnRdquM0MqfMIQAGdPknzdgcu+NM9kHCfdbqnEAzGSQR0qJe1kD/gQ7SnjPa0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=bf5D+MRl; arc=none smtp.client-ip=198.175.65.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="bf5D+MRl" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191323; x=1777727323; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EqV3ExI1VOsG9HGpNc51/naP8y50Je1jebPzL5aK9WQ=; b=bf5D+MRleu/1eJZfMObDEDin+TSA2OBJH+IhkDqhTZkfkgyccVIrc3UL EX6q9ba3THtlkOYMp1Rw32oGaTOkNj2GabwZbvLIkRiM+g6+aDFwHVoVm jjm7AvCldicvvndxTmHRHpFiH6vmV8vzALx92dM3UdwssVs9NkGATltWy FjmfGWx7aX9cBCAp9xGabmPrOMtV8PEcV/fibS2uoHfGxN0rxKGrfQ0bC sU3l8VXE/TK6j0Ah1Fc2sJSuy2Rk0b0oKiKK44QSzr0FXAtEE8p/Io9Y1 bo+R1SS+SdItVMF5v9scNgUm5G/cytbDg+VzQ1cgZcDIrsbOMw47wCNpw Q==; X-CSE-ConnectionGUID: FmSb/EtsRDyt8tv3JSW5fA== X-CSE-MsgGUID: Nha+lH50QX+Jtt7ActaY+w== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="48012959" X-IronPort-AV: 
E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="48012959" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:40 -0700 X-CSE-ConnectionGUID: Vw+bkxLgTUeNsFvPjSwu6w== X-CSE-MsgGUID: N3uxSHj2RuKGuH9iHRnRTQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="157871065" Received: from black.fi.intel.com ([10.237.72.28]) by fmviesa002.fm.intel.com with ESMTP; 02 May 2025 06:08:37 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 3AB451D2; Fri, 02 May 2025 16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [RFC, PATCH 04/12] x86/virt/tdx: Account PAMT memory and print it in /proc/meminfo Date: Fri, 2 May 2025 16:08:20 +0300 Message-ID: <20250502130828.4071412-5-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

PAMT memory can add up to a substantial portion of system memory.

Account these pages and report them in /proc/meminfo as TDX.

When no TD is running, PAMT memory consumption is supposed to be zero.

Signed-off-by: Kirill A.
Shutemov --- arch/x86/include/asm/set_memory.h | 2 ++ arch/x86/include/asm/tdx.h | 2 ++ arch/x86/mm/Makefile | 2 ++ arch/x86/mm/meminfo.c | 11 +++++++++++ arch/x86/mm/pat/set_memory.c | 2 +- arch/x86/virt/vmx/tdx/tdx.c | 26 ++++++++++++++++++++++++-- 6 files changed, 42 insertions(+), 3 deletions(-) create mode 100644 arch/x86/mm/meminfo.c diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_m= emory.h index 8d9f1c9aaa4c..e729e9f86e67 100644 --- a/arch/x86/include/asm/set_memory.h +++ b/arch/x86/include/asm/set_memory.h @@ -90,6 +90,8 @@ int set_direct_map_default_noflush(struct page *page); int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool vali= d); bool kernel_page_present(struct page *page); =20 +void direct_pages_meminfo(struct seq_file *m); + extern int kernel_set_to_readonly; =20 #endif /* _ASM_X86_SET_MEMORY_H */ diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index a134cf3ecd17..8091bf5b43cc 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -205,6 +205,7 @@ u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *= page); u64 tdh_phymem_pamt_add(unsigned long hpa, struct list_head *pamt_pages); u64 tdh_phymem_pamt_remove(unsigned long hpa, struct list_head *pamt_pages= ); =20 +void tdx_meminfo(struct seq_file *m); #else static inline void tdx_init(void) { } static inline int tdx_cpu_enable(void) { return -ENODEV; } @@ -213,6 +214,7 @@ static inline u32 tdx_get_nr_guest_keyids(void) { retur= n 0; } static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; } static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NU= LL; } static inline int tdx_nr_pamt_pages(const struct tdx_sys_info *sysinfo) { = return 0; } +static inline void tdx_meminfo(struct seq_file *m) {} #endif /* CONFIG_INTEL_TDX_HOST */ =20 #endif /* !__ASSEMBLER__ */ diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 32035d5be5a0..311d60801871 100644 --- 
a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -38,6 +38,8 @@ CFLAGS_fault.o :=3D -I $(src)/../include/asm/trace =20 obj-$(CONFIG_X86_32) +=3D pgtable_32.o iomap_32.o =20 +obj-$(CONFIG_PROC_FS) +=3D meminfo.o + obj-$(CONFIG_HUGETLB_PAGE) +=3D hugetlbpage.o obj-$(CONFIG_PTDUMP) +=3D dump_pagetables.o obj-$(CONFIG_PTDUMP_DEBUGFS) +=3D debug_pagetables.o diff --git a/arch/x86/mm/meminfo.c b/arch/x86/mm/meminfo.c new file mode 100644 index 000000000000..7bdb5df014de --- /dev/null +++ b/arch/x86/mm/meminfo.c @@ -0,0 +1,11 @@ +#include +#include + +#include +#include + +void arch_report_meminfo(struct seq_file *m) +{ + direct_pages_meminfo(m); + tdx_meminfo(m); +} diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index def3d9284254..59432b92e80e 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -118,7 +118,7 @@ static void collapse_page_count(int level) direct_pages_count[level - 1] -=3D PTRS_PER_PTE; } =20 -void arch_report_meminfo(struct seq_file *m) +void direct_pages_meminfo(struct seq_file *m) { seq_printf(m, "DirectMap4k: %8lu kB\n", direct_pages_count[PG_LEVEL_4K] << 2); diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 29defdb7f6bc..74bd81acef7b 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -2000,13 +2000,28 @@ u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct pa= ge *page) } EXPORT_SYMBOL_GPL(tdh_phymem_page_wbinvd_hkid); =20 +static atomic_long_t tdx_pamt_count =3D ATOMIC_LONG_INIT(0); + +void tdx_meminfo(struct seq_file *m) +{ + unsigned long usage; + + if (!cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM)) + return; + + usage =3D atomic_long_read(&tdx_pamt_count) * + tdx_nr_pamt_pages(&tdx_sysinfo) * PAGE_SIZE / SZ_1K; + + seq_printf(m, "TDX: %8lu kB\n", usage); +} + u64 tdh_phymem_pamt_add(unsigned long hpa, struct list_head *pamt_pages) { struct tdx_module_args args =3D { .rcx =3D hpa, }; struct page *page; - u64 *p; + u64 *p, ret; =20 
WARN_ON_ONCE(!IS_ALIGNED(hpa & PAGE_MASK, PMD_SIZE)); =20 @@ -2016,7 +2031,12 @@ u64 tdh_phymem_pamt_add(unsigned long hpa, struct li= st_head *pamt_pages) p++; } =20 - return seamcall(TDH_PHYMEM_PAMT_ADD, &args); + ret =3D seamcall(TDH_PHYMEM_PAMT_ADD, &args); + + if (!ret) + atomic_long_inc(&tdx_pamt_count); + + return ret; } EXPORT_SYMBOL_GPL(tdh_phymem_pamt_add); =20 @@ -2034,6 +2054,8 @@ u64 tdh_phymem_pamt_remove(unsigned long hpa, struct = list_head *pamt_pages) if (ret) return ret; =20 + atomic_long_dec(&tdx_pamt_count); + p =3D &args.rdx; for (int i =3D 0; i < tdx_nr_pamt_pages(&tdx_sysinfo); i++) { page =3D phys_to_page(*p); --=20 2.47.2
From nobody Sun Feb 8 09:26:55 2026
From: "Kirill A.
Shutemov"
To: pbonzini@redhat.com, seanjc@google.com
Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [RFC, PATCH 05/12] KVM: TDX: Add tdx_pamt_get()/put() helpers
Date: Fri, 2 May 2025 16:08:21 +0300
Message-ID: <20250502130828.4071412-6-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce a pair of helpers to allocate and free PAMT memory for a given 2M range. The range is identified by the struct page of any page within it, and the PAMT memory is represented by a list of pages.

Use per-2M refcounts to detect when PAMT memory has to be allocated and when it can be freed.

The pamt_lock spinlock serializes against races between multiple tdx_pamt_add() calls as well as tdx_pamt_add() vs tdx_pamt_put().

Signed-off-by: Kirill A.
Shutemov --- arch/x86/include/asm/tdx.h | 2 + arch/x86/kvm/vmx/tdx.c | 123 +++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/tdx_errno.h | 1 + 3 files changed, 126 insertions(+) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 8091bf5b43cc..42449c054938 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -135,6 +135,8 @@ static inline int tdx_nr_pamt_pages(const struct tdx_sy= s_info *sysinfo) return sysinfo->tdmr.pamt_4k_entry_size * PTRS_PER_PTE / PAGE_SIZE; } =20 +atomic_t *tdx_get_pamt_refcount(unsigned long hpa); + int tdx_guest_keyid_alloc(void); u32 tdx_get_nr_guest_keyids(void); void tdx_guest_keyid_free(unsigned int keyid); diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index b952bc673271..ea7e2d93fb44 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -207,6 +207,10 @@ static bool tdx_operand_busy(u64 err) return (err & TDX_SEAMCALL_STATUS_MASK) =3D=3D TDX_OPERAND_BUSY; } =20 +static bool tdx_hpa_range_not_free(u64 err) +{ + return (err & TDX_SEAMCALL_STATUS_MASK) =3D=3D TDX_HPA_RANGE_NOT_FREE; +} =20 /* * A per-CPU list of TD vCPUs associated with a given CPU. 
@@ -276,6 +280,125 @@ static inline void tdx_disassociate_vp(struct kvm_vcp= u *vcpu) vcpu->cpu =3D -1; } =20 +static DEFINE_SPINLOCK(pamt_lock); + +static void tdx_free_pamt_pages(struct list_head *pamt_pages) +{ + struct page *page; + + while ((page =3D list_first_entry_or_null(pamt_pages, struct page, lru)))= { + list_del(&page->lru); + __free_page(page); + } +} + +static int tdx_alloc_pamt_pages(struct list_head *pamt_pages) +{ + for (int i =3D 0; i < tdx_nr_pamt_pages(tdx_sysinfo); i++) { + struct page *page =3D alloc_page(GFP_KERNEL); + if (!page) + goto fail; + list_add(&page->lru, pamt_pages); + } + return 0; +fail: + tdx_free_pamt_pages(pamt_pages); + return -ENOMEM; +} + +static int tdx_pamt_add(atomic_t *pamt_refcount, unsigned long hpa, + struct list_head *pamt_pages) +{ + u64 err; + + hpa =3D ALIGN_DOWN(hpa, SZ_2M); + + spin_lock(&pamt_lock); + + /* Lost race to other tdx_pamt_add() */ + if (atomic_read(pamt_refcount) !=3D 0) { + atomic_inc(pamt_refcount); + spin_unlock(&pamt_lock); + tdx_free_pamt_pages(pamt_pages); + return 0; + } + + err =3D tdh_phymem_pamt_add(hpa | TDX_PS_2M, pamt_pages); + + if (err) + tdx_free_pamt_pages(pamt_pages); + + /* + * tdx_hpa_range_not_free() is true if current task won race + * against tdx_pamt_put(). 
+ */ + if (err && !tdx_hpa_range_not_free(err)) { + spin_unlock(&pamt_lock); + pr_tdx_error(TDH_PHYMEM_PAMT_ADD, err); + return -EIO; + } + + atomic_set(pamt_refcount, 1); + spin_unlock(&pamt_lock); + return 0; +} + +static int tdx_pamt_get(struct page *page) +{ + unsigned long hpa =3D page_to_phys(page); + atomic_t *pamt_refcount; + LIST_HEAD(pamt_pages); + + if (!tdx_supports_dynamic_pamt(tdx_sysinfo)) + return 0; + + pamt_refcount =3D tdx_get_pamt_refcount(hpa); + WARN_ON_ONCE(atomic_read(pamt_refcount) < 0); + + if (atomic_inc_not_zero(pamt_refcount)) + return 0; + + if (tdx_alloc_pamt_pages(&pamt_pages)) + return -ENOMEM; + + return tdx_pamt_add(pamt_refcount, hpa, &pamt_pages); +} + +static void tdx_pamt_put(struct page *page) +{ + unsigned long hpa =3D page_to_phys(page); + atomic_t *pamt_refcount; + LIST_HEAD(pamt_pages); + u64 err; + + if (!tdx_supports_dynamic_pamt(tdx_sysinfo)) + return; + + hpa =3D ALIGN_DOWN(hpa, SZ_2M); + + pamt_refcount =3D tdx_get_pamt_refcount(hpa); + if (!atomic_dec_and_test(pamt_refcount)) + return; + + spin_lock(&pamt_lock); + + /* Lost race against tdx_pamt_add()? 
*/ + if (atomic_read(pamt_refcount) !=3D 0) { + spin_unlock(&pamt_lock); + return; + } + + err =3D tdh_phymem_pamt_remove(hpa | TDX_PS_2M, &pamt_pages); + spin_unlock(&pamt_lock); + + if (err) { + pr_tdx_error(TDH_PHYMEM_PAMT_REMOVE, err); + return; + } + + tdx_free_pamt_pages(&pamt_pages); +} + static void tdx_clear_page(struct page *page) { const void *zero_page =3D (const void *) page_to_virt(ZERO_PAGE(0)); diff --git a/arch/x86/kvm/vmx/tdx_errno.h b/arch/x86/kvm/vmx/tdx_errno.h index 6ff4672c4181..c8a471d6b991 100644 --- a/arch/x86/kvm/vmx/tdx_errno.h +++ b/arch/x86/kvm/vmx/tdx_errno.h @@ -18,6 +18,7 @@ #define TDX_OPERAND_BUSY 0x8000020000000000ULL #define TDX_PREVIOUS_TLB_EPOCH_BUSY 0x8000020100000000ULL #define TDX_PAGE_METADATA_INCORRECT 0xC000030000000000ULL +#define TDX_HPA_RANGE_NOT_FREE 0xC000030400000000ULL #define TDX_VCPU_NOT_ASSOCIATED 0x8000070200000000ULL #define TDX_KEY_GENERATION_FAILED 0x8000080000000000ULL #define TDX_KEY_STATE_INCORRECT 0xC000081100000000ULL --=20 2.47.2
From nobody Sun Feb 8 09:26:55 2026
From: "Kirill A. Shutemov"
To: pbonzini@redhat.com, seanjc@google.com
Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [RFC, PATCH 06/12] KVM: TDX: Allocate PAMT memory in __tdx_td_init()
Date: Fri, 2 May 2025 16:08:22 +0300
Message-ID: <20250502130828.4071412-7-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Allocate PAMT memory for TDH.MNG.CREATE and TDH.MNG.ADDCX.

PAMT memory that is associated with pages successfully added to the TD with TDH.MNG.ADDCX will be removed in tdx_reclaim_page() on tdx_reclaim_control_page().

Signed-off-by: Kirill A.
Shutemov --- arch/x86/kvm/vmx/tdx.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index ea7e2d93fb44..59bbae2df485 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -399,6 +399,31 @@ static void tdx_pamt_put(struct page *page) tdx_free_pamt_pages(&pamt_pages); } =20 +static struct page *tdx_alloc_page(void) +{ + struct page *page; + + page =3D alloc_page(GFP_KERNEL); + if (!page) + return NULL; + + if (tdx_pamt_get(page)) { + __free_page(page); + return NULL; + } + + return page; +} + +static void tdx_free_page(struct page *page) +{ + if (!page) + return; + + tdx_pamt_put(page); + __free_page(page); +} + static void tdx_clear_page(struct page *page) { const void *zero_page =3D (const void *) page_to_virt(ZERO_PAGE(0)); @@ -2499,7 +2524,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_p= arams *td_params, =20 atomic_inc(&nr_configured_hkid); =20 - tdr_page =3D alloc_page(GFP_KERNEL); + tdr_page =3D tdx_alloc_page(); if (!tdr_page) goto free_hkid; =20 @@ -2512,7 +2537,7 @@ static int __tdx_td_init(struct kvm *kvm, struct td_p= arams *td_params, goto free_tdr; =20 for (i =3D 0; i < kvm_tdx->td.tdcs_nr_pages; i++) { - tdcs_pages[i] =3D alloc_page(GFP_KERNEL); + tdcs_pages[i] =3D tdx_alloc_page(); if (!tdcs_pages[i]) goto free_tdcs; } @@ -2633,10 +2658,8 @@ static int __tdx_td_init(struct kvm *kvm, struct td_= params *td_params, teardown: /* Only free pages not yet added, so start at 'i' */ for (; i < kvm_tdx->td.tdcs_nr_pages; i++) { - if (tdcs_pages[i]) { - __free_page(tdcs_pages[i]); - tdcs_pages[i] =3D NULL; - } + tdx_free_page(tdcs_pages[i]); + tdcs_pages[i] =3D NULL; } if (!kvm_tdx->td.tdcs_pages) kfree(tdcs_pages); @@ -2652,15 +2675,13 @@ static int __tdx_td_init(struct kvm *kvm, struct td= _params *td_params, =20 free_tdcs: for (i =3D 0; i < kvm_tdx->td.tdcs_nr_pages; i++) { - if (tdcs_pages[i]) - __free_page(tdcs_pages[i]); + 
tdx_free_page(tdcs_pages[i]); } kfree(tdcs_pages); kvm_tdx->td.tdcs_pages =3D NULL; =20 free_tdr: - if (tdr_page) - __free_page(tdr_page); + tdx_free_page(tdr_page); kvm_tdx->td.tdr_page =3D 0; =20 free_hkid: --=20 2.47.2
From nobody Sun Feb 8 09:26:55 2026
From: "Kirill A. Shutemov"
To: pbonzini@redhat.com, seanjc@google.com
Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A.
Shutemov"
Subject: [RFC, PATCH 07/12] KVM: TDX: Allocate PAMT memory in tdx_td_vcpu_init()
Date: Fri, 2 May 2025 16:08:23 +0300
Message-ID: <20250502130828.4071412-8-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Allocate PAMT memory for TDH.VP.CREATE and TDH.VP.ADDCX.

PAMT memory that is associated with pages successfully added to the TD with TDH.VP.ADDCX will be removed in tdx_reclaim_page() on tdx_reclaim_control_page().

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/kvm/vmx/tdx.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 59bbae2df485..18c4ae00cd8d 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -2983,7 +2983,7 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u6= 4 vcpu_rcx) int ret, i; u64 err; =20 - page =3D alloc_page(GFP_KERNEL); + page =3D tdx_alloc_page(); if (!page) return -ENOMEM; tdx->vp.tdvpr_page =3D page; @@ -2996,7 +2996,7 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u6= 4 vcpu_rcx) } =20 for (i =3D 0; i < kvm_tdx->td.tdcx_nr_pages; i++) { - page =3D alloc_page(GFP_KERNEL); + page =3D tdx_alloc_page(); if (!page) { ret =3D -ENOMEM; goto free_tdcx; @@ -3020,7 +3020,7 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u6= 4 vcpu_rcx) * method, but the rest are freed here.
*/ for (; i < kvm_tdx->td.tdcx_nr_pages; i++) { - __free_page(tdx->vp.tdcx_pages[i]); + tdx_free_page(tdx->vp.tdcx_pages[i]); tdx->vp.tdcx_pages[i] =3D NULL; } return -EIO; @@ -3039,16 +3039,15 @@ static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, = u64 vcpu_rcx) =20 free_tdcx: for (i =3D 0; i < kvm_tdx->td.tdcx_nr_pages; i++) { - if (tdx->vp.tdcx_pages[i]) - __free_page(tdx->vp.tdcx_pages[i]); + tdx_free_page(tdx->vp.tdcx_pages[i]); tdx->vp.tdcx_pages[i] =3D NULL; } kfree(tdx->vp.tdcx_pages); tdx->vp.tdcx_pages =3D NULL; =20 free_tdvpr: - if (tdx->vp.tdvpr_page) - __free_page(tdx->vp.tdvpr_page); + tdx_free_page(tdx->vp.tdvpr_page); + tdx->vp.tdvpr_page =3D 0; =20 return ret; --=20 2.47.2
From nobody Sun Feb 8 09:26:55 2026
From: "Kirill A.
Shutemov"
To: pbonzini@redhat.com, seanjc@google.com
Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [RFC, PATCH 08/12] KVM: x86/tdp_mmu: Add phys_prepare() and phys_cleanup() to kvm_x86_ops
Date: Fri, 2 May 2025 16:08:24 +0300
Message-ID: <20250502130828.4071412-9-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The functions kvm_x86_ops::link_external_spt() and kvm_x86_ops::set_external_spte() are used to assign new memory to a VM. When using TDX with Dynamic PAMT enabled, the assigned memory must be covered by PAMT.

The new function kvm_x86_ops::phys_prepare() is called before link_external_spt() and set_external_spte() to ensure that the memory is ready to be assigned to the virtual machine. In the case of TDX, it makes sure that the memory is covered by PAMT.

kvm_x86_ops::phys_prepare() is called in a context where struct kvm_vcpu is available, allowing the implementation to allocate memory from a per-VCPU pool.

The function kvm_x86_ops::phys_cleanup() frees PAMT memory in case of failure.

Signed-off-by: Kirill A.
Shutemov --- arch/x86/include/asm/kvm-x86-ops.h | 2 ++ arch/x86/include/asm/kvm_host.h | 3 ++ arch/x86/kvm/mmu/tdp_mmu.c | 47 +++++++++++++++++++++++++++--- 3 files changed, 48 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-= x86-ops.h index 79406bf07a1c..37081d04e82f 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -99,6 +99,8 @@ KVM_X86_OP_OPTIONAL(link_external_spt) KVM_X86_OP_OPTIONAL(set_external_spte) KVM_X86_OP_OPTIONAL(free_external_spt) KVM_X86_OP_OPTIONAL(remove_external_spte) +KVM_X86_OP_OPTIONAL(phys_prepare) +KVM_X86_OP_OPTIONAL(phys_cleanup) KVM_X86_OP(has_wbinvd_exit) KVM_X86_OP(get_l2_tsc_offset) KVM_X86_OP(get_l2_tsc_multiplier) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 6c06f3d6e081..91958c55f918 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1813,6 +1813,9 @@ struct kvm_x86_ops { int (*remove_external_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level lev= el, kvm_pfn_t pfn_for_gfn); =20 + int (*phys_prepare)(struct kvm_vcpu *vcpu, kvm_pfn_t pfn); + void (*phys_cleanup)(kvm_pfn_t pfn); + bool (*has_wbinvd_exit)(void); =20 u64 (*get_l2_tsc_offset)(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 405874f4d088..f6c836b2e6fc 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1137,6 +1137,26 @@ void kvm_tdp_mmu_invalidate_roots(struct kvm *kvm, } } =20 +static int tdp_mmu_install_spte(struct kvm_vcpu *vcpu, + struct tdp_iter *iter, + u64 spte) +{ + kvm_pfn_t pfn =3D 0; + int ret =3D 0; + + if (is_mirror_sptep(iter->sptep) && !is_frozen_spte(spte)) { + pfn =3D spte_to_pfn(spte); + ret =3D static_call(kvm_x86_phys_prepare)(vcpu, pfn); + } + if (ret) + return ret; + ret =3D tdp_mmu_set_spte_atomic(vcpu->kvm, iter, spte); + if (pfn && ret) + static_call(kvm_x86_phys_cleanup)(pfn); + + return ret; +} + 
/* * Installs a last-level SPTE to handle a TDP page fault. * (NPT/EPT violation/misconfiguration) @@ -1170,7 +1190,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm= _vcpu *vcpu, =20 if (new_spte =3D=3D iter->old_spte) ret =3D RET_PF_SPURIOUS; - else if (tdp_mmu_set_spte_atomic(vcpu->kvm, iter, new_spte)) + else if (tdp_mmu_install_spte(vcpu, iter, new_spte)) return RET_PF_RETRY; else if (is_shadow_present_pte(iter->old_spte) && (!is_last_spte(iter->old_spte, iter->level) || @@ -1211,7 +1231,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm= _vcpu *vcpu, * Returns: 0 if the new page table was installed. Non-0 if the page table * could not be installed (e.g. the atomic compare-exchange faile= d). */ -static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter, +static int __tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter, struct kvm_mmu_page *sp, bool shared) { u64 spte =3D make_nonleaf_spte(sp->spt, !kvm_ad_enabled); @@ -1230,6 +1250,25 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct t= dp_iter *iter, return 0; } =20 +static int tdp_mmu_link_sp(struct kvm_vcpu *vcpu, struct tdp_iter *iter, + struct kvm_mmu_page *sp, bool shared) +{ + kvm_pfn_t pfn =3D 0; + int ret =3D 0; + + if (sp->external_spt) { + pfn =3D __pa(sp->external_spt) >> PAGE_SHIFT; + ret =3D static_call(kvm_x86_phys_prepare)(vcpu, pfn); + if (ret) + return ret; + } + ret =3D __tdp_mmu_link_sp(vcpu->kvm, iter, sp, shared); + if (pfn && ret) + static_call(kvm_x86_phys_cleanup)(pfn); + + return ret; +} + static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter, struct kvm_mmu_page *sp, bool shared); =20 @@ -1288,7 +1327,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm= _page_fault *fault) KVM_BUG_ON(is_mirror_sptep(iter.sptep), vcpu->kvm); r =3D tdp_mmu_split_huge_page(kvm, &iter, sp, true); } else { - r =3D tdp_mmu_link_sp(kvm, &iter, sp, true); + r =3D tdp_mmu_link_sp(vcpu, &iter, sp, true); } =20 /* @@ -1514,7 +1553,7 @@ static int 
tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter, * correctness standpoint since the translation will be the same either * way. */ - ret = tdp_mmu_link_sp(kvm, iter, sp, shared); + ret = __tdp_mmu_link_sp(kvm, iter, sp, shared); if (ret) goto out; -- 2.47.2 From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A6E5C255F35; Fri, 2 May 2025 13:08:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191327; cv=none; b=McEU6O55S6xbwtmQUpSpvO4zWBxD/8SRBp6qJaHgWqLYituNU5jEksu1rMc+sd7M5GHMawh/GxxB+6tkpT6fe3NyP/QpCALqmN121E73oRvM2S1S+5TqizT/Wu/ICPjY9MjuypRAFMqnMRQwFYOlveQaqBuqgCb/0Eep3vdK08s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191327; c=relaxed/simple; bh=I6qbJYYYEbsv9ghw69w99htbaZsoBJusuXa9G8q9fEE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DXaWDe7wK1Q/lb2PopcpIBYXZTA5wsAJTWScyiLh65JPd9SJeJxChnCmfybS4NpeUhOHNc/wmFcbWUhKFO4Pum1WO3CaGLVxt7baSCV2xWG2xCuiOGJ65ZHBHTq+F75LXQqa/sCSCnRCxUk4LVDnCII2RfOCuYViD4De5MMr5jQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=VX3hW1dt; arc=none smtp.client-ip=198.175.65.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="VX3hW1dt" DKIM-Signature: v=1;
a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191325; x=1777727325; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=I6qbJYYYEbsv9ghw69w99htbaZsoBJusuXa9G8q9fEE=; b=VX3hW1dtyYS2XViTen4Ki/lODtaQLTz24n3LiEFcbvjPio/0EJ4/abVV S+THQnYYyhw66B9PaXtAGLjJQ3NP09C2bh/FnaddRS/B4hbE4pBkWNmo3 UmoeNnmiWAvE634I0OkI0RiA0cT8JpOhM3XK5F1AZq+icAlIMkAl8eZ6j 8kOuZBPDQR57wLYsIr121LUGLtU8ssoIsqxxf2qFIrlwCyZhfOrCZaNHs yHjJjgPC33ygwBthsxmwUlQ++YZv86YvwxBYqmK9auOIwHY26Et/4Woge /EBRGiyriq5zE8M5b+ZqcRm9SppXA+qBUcC+1VmOBpkWLH36Az3HzatfI Q==; X-CSE-ConnectionGUID: nBitgRh1TUCYEiPHVytP9w== X-CSE-MsgGUID: pp1tPVEfQ9WZYLIwxbnKhQ== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="48012987" X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="48012987" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:45 -0700 X-CSE-ConnectionGUID: IIAkqwwPTfGrRroE/gPC2A== X-CSE-MsgGUID: tzKiF+OjR1SUi0GIX1ZPUQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="157871091" Received: from black.fi.intel.com ([10.237.72.28]) by fmviesa002.fm.intel.com with ESMTP; 02 May 2025 06:08:41 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 7A599366; Fri, 02 May 2025 16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. 
Shutemov" Subject: [RFC, PATCH 09/12] KVM: TDX: Preallocate PAMT pages to be used in page fault path Date: Fri, 2 May 2025 16:08:25 +0300 Message-ID: <20250502130828.4071412-10-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Preallocate PAMT pages to be used in the link_external_spt() and set_external_spte() paths. In the worst-case scenario, handling a page fault might require tdx_nr_pamt_pages() pages for each page table level. Signed-off-by: Kirill A. Shutemov --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/mmu/mmu.c | 10 ++++++++++ 2 files changed, 12 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 91958c55f918..a5661499a176 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -849,6 +849,8 @@ struct kvm_vcpu_arch { */ struct kvm_mmu_memory_cache mmu_external_spt_cache; + struct kvm_mmu_memory_cache pamt_page_cache; + /* * QEMU userspace and the guest each have their own FPU state. * In vcpu_run, we switch between the user and guest FPU contexts.
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index a284dce227a0..7bfa0dc50440 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -616,6 +616,15 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect) if (r) return r; } + + if (vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM) { + int nr = tdx_nr_pamt_pages(tdx_get_sysinfo()); + r = kvm_mmu_topup_memory_cache(&vcpu->arch.pamt_page_cache, + nr * PT64_ROOT_MAX_LEVEL); + if (r) + return r; + } + return kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_page_header_cache, PT64_ROOT_MAX_LEVEL); } @@ -626,6 +635,7 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_external_spt_cache); + kvm_mmu_free_memory_cache(&vcpu->arch.pamt_page_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } -- 2.47.2 From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 67CD02566D4; Fri, 2 May 2025 13:08:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191329; cv=none; b=KwUUHhaq/OJaA0DrC8r0HjC0q3KDYAJIDkpmyXyqDjtpQvbHdULHMzyQIHGr1qNGLIdNFYa8Es1ew3qxYf7BpBdliCVAJC0OHSUVDOWSjE3FQeq3X0qjUU7JeyWbSQvuDBQ5V6WRHgIxbPZucbPzjyLOMtvVcNC5MJWcnuxQMcM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191329; c=relaxed/simple; bh=m2m5QyspU/4UyDV/0jL1rXptk37jp9pz/ghUzoVVwFo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=QctJOk40ws/iPnxXepfdhmZmdroAWSRMPolnR2N7+47Lt+M6rHyEzSQzhYalXek+ni5cvhTT4Mu9W1jdNE/cKzSOhGcOqCapiG5n/fpn2RGa7uMeB0RvOF+YhgWezZcbIj17iH7b+8I3JbXP7/0gMV3DnDYFzhZl56E6c+Ugp8E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YMNO7feR; arc=none smtp.client-ip=198.175.65.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YMNO7feR" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191327; x=1777727327; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=m2m5QyspU/4UyDV/0jL1rXptk37jp9pz/ghUzoVVwFo=; b=YMNO7feRkltCs9GArrga/LzvpSstpMfJ5ztb+L0XKCc8W/WPVY/h6JgO /hljyOHvpE37zeNjdjrPvbFUrSdpgO3/qhT7+fO/5qWBLceuAjrfge5Id RprSpeBjqwMMDapuB4bD+486OJMJaiQxNxH4fcTzr1AjB98kPBMoaJwDy ulBoxCYZzHHNCRoACdCMqF56IDA3I/bqotC7SfvGl5Ksa5nIT1RFF1Gcw fmmV3NkY7Tuehz4b8HpXjEA+77ZJlpXT7gNuJmNBqT8Qlo6j3iwRZHWEZ t4thfOLLBuEXvXmJ/62lTrZAUUObyNtWr0E0YWeWoL5F5PCLKPj/2eRQ7 Q==; X-CSE-ConnectionGUID: lCF/6zVVS+aoYhqbHdEllg== X-CSE-MsgGUID: JFmFK1pHQqGRJmYVf+Nk3A== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="48012989" X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="48012989" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:45 -0700 X-CSE-ConnectionGUID: LfwDnBDURhq5OyBvLXjxmw== X-CSE-MsgGUID: zQqhLQYdSiKeYfAeNjyBVw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="157871093" Received: from black.fi.intel.com 
([10.237.72.28]) by fmviesa002.fm.intel.com with ESMTP; 02 May 2025 06:08:41 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 86CB636F; Fri, 02 May 2025 16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [RFC, PATCH 10/12] KVM: TDX: Hookup phys_prepare() and phys_cleanup() kvm_x86_ops Date: Fri, 2 May 2025 16:08:26 +0300 Message-ID: <20250502130828.4071412-11-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allocate PAMT memory from a per-VCPU pool in kvm_x86_ops::phys_prepare() and release memory in kvm_x86_ops::phys_cleanup(). The TDP code invokes these callbacks to handle PAMT memory management. Signed-off-by: Kirill A. 
Shutemov --- arch/x86/kvm/vmx/main.c | 2 ++ arch/x86/kvm/vmx/tdx.c | 30 ++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/x86_ops.h | 9 +++++++++ virt/kvm/kvm_main.c | 1 + 4 files changed, 42 insertions(+) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 94d5d907d37b..665a3dbd4ba5 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -63,6 +63,8 @@ static __init int vt_hardware_setup(void) vt_x86_ops.free_external_spt = tdx_sept_free_private_spt; vt_x86_ops.remove_external_spte = tdx_sept_remove_private_spte; vt_x86_ops.protected_apic_has_interrupt = tdx_protected_apic_has_interrupt; + vt_x86_ops.phys_prepare = tdx_phys_prepare; + vt_x86_ops.phys_cleanup = tdx_phys_cleanup; } return 0; diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 18c4ae00cd8d..0f06ae7ff6b9 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1958,6 +1958,36 @@ int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn, return tdx_sept_drop_private_spte(kvm, gfn, level, page); } +int tdx_phys_prepare(struct kvm_vcpu *vcpu, kvm_pfn_t pfn) +{ + unsigned long hpa = pfn << PAGE_SHIFT; + atomic_t *pamt_refcount; + LIST_HEAD(pamt_pages); + + if (!tdx_supports_dynamic_pamt(tdx_sysinfo)) + return 0; + + pamt_refcount = tdx_get_pamt_refcount(hpa); + if (atomic_inc_not_zero(pamt_refcount)) + return 0; + + for (int i = 0; i < tdx_nr_pamt_pages(tdx_sysinfo); i++) { + struct page *page; + void *p; + + p = kvm_mmu_memory_cache_alloc(&vcpu->arch.pamt_page_cache); + page = virt_to_page(p); + list_add(&page->lru, &pamt_pages); + } + + return tdx_pamt_add(pamt_refcount, hpa, &pamt_pages); +} + +void tdx_phys_cleanup(kvm_pfn_t pfn) +{ + tdx_pamt_put(pfn_to_page(pfn)); +} + void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, int trig_mode, int vector) { diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 6bf8be570b2e..111f16c3039f 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++
b/arch/x86/kvm/vmx/x86_ops.h @@ -158,6 +158,8 @@ int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn, enum pg_level level, kvm_pfn_t pfn); int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn, enum pg_level level, kvm_pfn_t pfn); +int tdx_phys_prepare(struct kvm_vcpu *vcpu, kvm_pfn_t pfn); +void tdx_phys_cleanup(kvm_pfn_t pfn); void tdx_flush_tlb_current(struct kvm_vcpu *vcpu); void tdx_flush_tlb_all(struct kvm_vcpu *vcpu); @@ -224,6 +226,13 @@ static inline int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn, return -EOPNOTSUPP; } +static inline int tdx_phys_prepare(struct kvm_vcpu *vcpu, kvm_pfn_t pfn) +{ + return -EOPNOTSUPP; +} + +static inline void tdx_phys_cleanup(kvm_pfn_t pfn) {} + static inline void tdx_flush_tlb_current(struct kvm_vcpu *vcpu) {} static inline void tdx_flush_tlb_all(struct kvm_vcpu *vcpu) {} static inline void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level) {} diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 69782df3617f..c3ba3ca37940 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -436,6 +436,7 @@ void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc) BUG_ON(!p); return p; } +EXPORT_SYMBOL_GPL(kvm_mmu_memory_cache_alloc); #endif static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id) -- 2.47.2 From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1ED0D255E37; Fri, 2 May 2025 13:08:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191326; cv=none;
b=Ay+t/lNyW2L9nZFxxYOCn1Cq5avkS7Z0zQvsLoppmu3MeLMaznMY6X1pUyTyjLqjZJLWeObv6xCqgW0S+1jb0d8TVTf87AYWDGC47XH2eq68zIIn/kmhW+9iQgzRL0OLUz70qGF0ahpYvNewIcEfL/K6Ib84kwvEPJnR3GezMC0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191326; c=relaxed/simple; bh=5YulLqzYaPRFiDwv0UiD8Q1U8nJJ4tt2rc8wqBnQCTA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=IZYiXJWdyiNeCa5W9MjUHsE6rKFphrj3DlnyUvXZDBekqebGgHsw6vgTaWAg+xU74dRRVCLftcsPVD5wVyI1CQ1GUdle6nchVeoPJROA/Xs7zDB2kNIPd8dy/76b+02YLAWytyGZGpIt63Q/vi5evun3M/Qjgnr1mMlppRUN0jU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=bNDvKZ0D; arc=none smtp.client-ip=198.175.65.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="bNDvKZ0D" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191325; x=1777727325; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5YulLqzYaPRFiDwv0UiD8Q1U8nJJ4tt2rc8wqBnQCTA=; b=bNDvKZ0DMHP8yN1tf2GXWQiWprFvXYDnqj5DVy58LfFGJNl19/JpdnHF HYviKuvSmHUPzMMbGLxMnlzqFCl6ORkZ7QJuFvCpXHyW4I9dDFdxADi2Z +kLvCgJqVU3gXbtB9bjMYvgnqMtCkXpPN/7T8bJ0DaC+Sl1zEsWNq09dy nLZr0qr2gLtBopLuhJKvXoNZ7QTiEIduDO9IxKbp1zOVP0fTQE0LMeIC+ PDENXgSzVVIcL+FirakUrkqj+lyajRz57CsVxF/i2vHsCHnuJ6zE64JI/ ZcjzHJ0qHzzsL5NBLnOKkvx3SOveA5BXis0v0DQZiVbSAAten4cyrphl6 Q==; X-CSE-ConnectionGUID: WzwgLAyYS4O/cIiS8SAQ6w== X-CSE-MsgGUID: jQAMGxmZQPy4mwfWgQAd5A== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="48012973" X-IronPort-AV: 
E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="48012973" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:44 -0700 X-CSE-ConnectionGUID: sdojZYYiQNKO+f/l5nWG9Q== X-CSE-MsgGUID: IfHdJ5QqSX2fqonxzhahQQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="157871082" Received: from black.fi.intel.com ([10.237.72.28]) by fmviesa002.fm.intel.com with ESMTP; 02 May 2025 06:08:41 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 8F02D3EA; Fri, 02 May 2025 16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [RFC, PATCH 11/12] KVM: TDX: Reclaim PAMT memory Date: Fri, 2 May 2025 16:08:27 +0300 Message-ID: <20250502130828.4071412-12-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The PAMT memory holds metadata for TDX-protected memory. With Dynamic PAMT, PAMT_4K is allocated on demand. The kernel supplies the TDX module with a few pages that cover 2M of host physical memory. PAMT memory can be reclaimed when the last user is gone. It can happen in a few code paths: - On TDH.PHYMEM.PAGE.RECLAIM in tdx_reclaim_td_control_pages() and tdx_reclaim_page(). - On TDH.MEM.PAGE.REMOVE in tdx_sept_drop_private_spte(). 
- In tdx_sept_zap_private_spte() for pages that were in the queue to be added with TDH.MEM.PAGE.ADD, but it never happened due to an error. Add tdx_pamt_put() in these code paths. Signed-off-by: Kirill A. Shutemov --- arch/x86/kvm/vmx/tdx.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 0f06ae7ff6b9..352f7b41f611 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -487,8 +487,11 @@ static int tdx_reclaim_page(struct page *page) int r; r = __tdx_reclaim_page(page); - if (!r) + if (!r) { tdx_clear_page(page); + tdx_pamt_put(page); + } + return r; } @@ -737,6 +740,7 @@ static void tdx_reclaim_td_control_pages(struct kvm *kvm) return; } tdx_clear_page(kvm_tdx->td.tdr_page); + tdx_pamt_put(kvm_tdx->td.tdr_page); __free_page(kvm_tdx->td.tdr_page); kvm_tdx->td.tdr_page = NULL; @@ -1768,6 +1772,7 @@ static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn, return -EIO; } tdx_clear_page(page); + tdx_pamt_put(page); tdx_unpin(kvm, page); return 0; } @@ -1848,6 +1853,7 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn, if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level) && !KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm)) { atomic64_dec(&kvm_tdx->nr_premapped); + tdx_pamt_put(page); tdx_unpin(kvm, page); return 0; } -- 2.47.2 From nobody Sun Feb 8 09:26:55 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D87052561BB; Fri, 2 May 2025 13:08:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191328; cv=none;
b=LD+TJox/UbxAfFmC/A6FE2cjKyh0HMxtNIAMsZWsUCo4BcKqcFrXAAnCaMeV/ykgjDD4+h4w7KS16RJjqNUKMM/pc0ADq2cE6N79faTcMR1aexXmWbaZHbCGD8qR0BHGc2wu0JMsi6AacBbQGJquWdytgMEkaJRHyYxMIWQE+lw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746191328; c=relaxed/simple; bh=7xeWckPjP89IBi9NXNh7YPusMMIHW2rLxxKdSrf9EoQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=feqw29DVTLx6ReqhrfAQlON9QhVNWdtBnqcvStKFPpC0+pAWYk+6pGmOs0OX7KClqwi82ZXDZUgVWZfnhw3nbzNF+VBcEe54yNb6AdnfjW61AXFfwEh6Ja5bN0AMRgv5x3z1lFCm5cSzjjATdBYobNCtSRgKNbfXq0pjK0J0OnQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.helo=mgamail.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=CCm1wqj3; arc=none smtp.client-ip=198.175.65.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.helo=mgamail.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="CCm1wqj3" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1746191326; x=1777727326; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7xeWckPjP89IBi9NXNh7YPusMMIHW2rLxxKdSrf9EoQ=; b=CCm1wqj3f+ZaTja4b2M1+ofWaz/G7JWxmP+2sU/U66oeK7yUxBq6CcFs rJoXD1qxQZy22dgTQGhd00ThVk9MMU4cxSDS+as/aEQDE/ey8JIqVIirJ I3hAErPNBtFs2xiOcFoY5zsvEO+dyJQM9dEmN3uuXeup19o4WEqN4zyWL 77PqLB+wdB4pYjnb3q8Tk3xR0f8lZblmdDPISMku0eJ+wttLVk3ecs7DP syJuv0UAGtrG53pbHuxHBbxytKdM5Il1nv/DHacnBtQVvB/92BQSoNuJ6 sXKuwWGnaRiZEsd1hP/kO3JSB8c7qWUO/aiSZkKZJW06v6qoHg6gnRXyN g==; X-CSE-ConnectionGUID: gAWWfsLgThWVnxYBahDYgw== X-CSE-MsgGUID: 8J8R3T25QRKWuZbMMgR3hg== X-IronPort-AV: E=McAfee;i="6700,10204,11421"; a="48013001" X-IronPort-AV: 
E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="48013001" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa110.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 May 2025 06:08:45 -0700 X-CSE-ConnectionGUID: p2hxTxWpReWIJnPWa4/6yQ== X-CSE-MsgGUID: XeibqyAdQrmPjFUwZsSSIw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,256,1739865600"; d="scan'208";a="157871094" Received: from black.fi.intel.com ([10.237.72.28]) by fmviesa002.fm.intel.com with ESMTP; 02 May 2025 06:08:41 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 97A1D436; Fri, 02 May 2025 16:08:36 +0300 (EEST) From: "Kirill A. Shutemov" To: pbonzini@redhat.com, seanjc@google.com Cc: rick.p.edgecombe@intel.com, isaku.yamahata@intel.com, kai.huang@intel.com, yan.y.zhao@intel.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, kvm@vger.kernel.org, x86@kernel.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [RFC, PATCH 12/12] x86/virt/tdx: Enable Dynamic PAMT Date: Fri, 2 May 2025 16:08:28 +0300 Message-ID: <20250502130828.4071412-13-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> References: <20250502130828.4071412-1-kirill.shutemov@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The Physical Address Metadata Table (PAMT) holds TDX metadata for physical memory and must be allocated by the kernel during TDX module initialization. The exact size of the required PAMT memory is determined by the TDX module and may vary between TDX module versions, but currently it is approximately 0.4% of the system memory. This is a significant commitment, especially if it is not known upfront whether the machine will run any TDX guests. 
The Dynamic PAMT feature reduces static PAMT allocations. PAMT_1G and PAMT_2M levels are still allocated on TDX module initialization, but the PAMT_4K level is allocated dynamically, reducing static allocations to approximately 0.004% of the system memory. All pieces are in place. Enable Dynamic PAMT if it is supported. Signed-off-by: Kirill A. Shutemov --- arch/x86/include/asm/tdx.h | 6 +++++- arch/x86/virt/vmx/tdx/tdx.c | 8 ++++++++ arch/x86/virt/vmx/tdx/tdx.h | 3 --- 3 files changed, 13 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index 42449c054938..5744f98d193e 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -32,6 +32,10 @@ #define TDX_SUCCESS 0ULL #define TDX_RND_NO_ENTROPY 0x8000020300000000ULL +/* Bit definitions of TDX_FEATURES0 metadata field */ +#define TDX_FEATURES0_NO_RBP_MOD BIT_ULL(18) +#define TDX_FEATURES0_DYNAMIC_PAMT BIT_ULL(36) + #ifndef __ASSEMBLER__ #include @@ -127,7 +131,7 @@ const struct tdx_sys_info *tdx_get_sysinfo(void); static inline bool tdx_supports_dynamic_pamt(const struct tdx_sys_info *sysinfo) { - return false; /* To be enabled when kernel is ready */ + return sysinfo->features.tdx_features0 & TDX_FEATURES0_DYNAMIC_PAMT; } static inline int tdx_nr_pamt_pages(const struct tdx_sys_info *sysinfo) diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 74bd81acef7b..f35566c0588d 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -945,6 +945,8 @@ static int construct_tdmrs(struct list_head *tmb_list, return ret; } +#define TDX_SYS_CONFIG_DYNAMIC_PAMT BIT(16) + static int config_tdx_module(struct tdmr_info_list *tdmr_list, u64 global_keyid) { struct tdx_module_args args = {}; @@ -972,6 +974,12 @@ static int config_tdx_module(struct tdmr_info_list *tdmr_list, u64 global_keyid) args.rcx = __pa(tdmr_pa_array); args.rdx = tdmr_list->nr_consumed_tdmrs; args.r8 = global_keyid; + +
if (tdx_supports_dynamic_pamt(&tdx_sysinfo)) { + pr_info("Enable Dynamic PAMT\n"); + args.r8 |= TDX_SYS_CONFIG_DYNAMIC_PAMT; + } + ret = seamcall_prerr(TDH_SYS_CONFIG, &args); /* Free the array as it is not required anymore. */ diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h index 46c4214b79fb..096c78a1d438 100644 --- a/arch/x86/virt/vmx/tdx/tdx.h +++ b/arch/x86/virt/vmx/tdx/tdx.h @@ -86,9 +86,6 @@ struct tdmr_info { DECLARE_FLEX_ARRAY(struct tdmr_reserved_area, reserved_areas); } __packed __aligned(TDMR_INFO_ALIGNMENT); -/* Bit definitions of TDX_FEATURES0 metadata field */ -#define TDX_FEATURES0_NO_RBP_MOD BIT(18) - /* * Do not put any hardware-defined TDX structure representations below * this comment! -- 2.47.2
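To put the percentages quoted in the 12/12 description into absolute terms, here is a minimal sketch that estimates the static PAMT allocation for a given amount of system memory. The two ratios (about 0.4% without Dynamic PAMT, about 0.004% with it) are taken from the commit message and are approximations; the exact size is determined by the TDX module. The macro and function names, and the parts-per-100000 representation, are made up for this illustration and are not part of the series or of any TDX interface.

```c
#include <stdint.h>

/*
 * Approximate static PAMT overhead in parts per 100000 of system memory.
 * Both values are the rough figures from the commit message, not values
 * reported by the TDX module.
 */
#define PAMT_STATIC_PP100K   400u  /* ~0.4%: PAMT_4K allocated up front */
#define PAMT_DYNAMIC_PP100K    4u  /* ~0.004%: PAMT_4K allocated on demand */

/* Hypothetical helper: estimated static PAMT size in bytes. */
static uint64_t pamt_estimate_bytes(uint64_t mem_bytes, unsigned int pp100k)
{
	/* Divide before multiplying so large memory sizes cannot overflow. */
	return mem_bytes / 100000u * pp100k;
}
```

Under these assumptions, calling pamt_estimate_bytes(1ULL << 40, PAMT_STATIC_PP100K) for a host with 1 TiB of memory gives roughly 4.1 GiB of static PAMT, while the same call with PAMT_DYNAMIC_PP100K gives roughly 42 MiB.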