From nobody Mon Jun 8 09:49:34 2026
From: Zhenzhong Duan
To: marcandre.lureau@redhat.com, david@kernel.org, kas@kernel.org,
	rick.p.edgecombe@intel.com, prsampat@amd.com, pbonzini@redhat.com,
	mst@redhat.com, peterx@redhat.com, chenyi.qiang@intel.com,
	elena.reshetova@intel.com, michaeluth@amd.com, ackerleytng@google.com
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	virtualization@lists.linux.dev, x86@kernel.org, yilun.xu@intel.com,
	xiaoyao.li@intel.com, chao.p.peng@intel.com
Subject: [RFC PATCH 1/6] mm/memory_hotplug: Add memory post-plug callback infrastructure
Date: Thu, 4 Jun 2026 05:35:46 -0400
Message-ID: <20260604093551.1511079-2-zhenzhong.duan@intel.com>
In-Reply-To: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
References: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In confidential computing environments like TDX, newly added memory must
be explicitly "accepted" by the guest before it can be safely accessed.
When virtio-mem or other memory hotplug drivers add memory to a TDX
guest, the memory pages are initially in an "unaccepted" state. Accessing
unaccepted memory triggers VM exits and can cause guest crashes. The
guest must issue the TDG.MEM.PAGE.ACCEPT TDCALL to accept each page
before use.

This callback infrastructure allows the TDX guest code to register a
handler that will be invoked after memory is plugged, ensuring all newly
added memory is properly accepted before being made available to the
kernel's memory management subsystem.

Signed-off-by: Zhenzhong Duan
---
 include/linux/memory_hotplug.h | 11 +++++++++++
 mm/memory_hotplug.c            | 20 ++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 815e908c4135..39f0a35a5112 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -28,6 +28,8 @@ enum mmop {
 	MMOP_ONLINE_MOVABLE,
 };
 
+typedef int (*memory_post_plug_callback_t)(u64 addr, u64 size);
+
 #ifdef CONFIG_MEMORY_HOTPLUG
 struct page *pfn_to_online_page(unsigned long pfn);
 
@@ -176,6 +178,9 @@ static inline void pgdat_kswapd_lock_init(pg_data_t *pgdat)
 	mutex_init(&pgdat->kswapd_lock);
 }
 
+void set_memory_post_plug_callback(memory_post_plug_callback_t callback);
+int memory_post_plug_call(u64 addr, u64 size);
+
 #else /* ! CONFIG_MEMORY_HOTPLUG */
 #define pfn_to_online_page(pfn)			\
 ({						\
@@ -221,6 +226,12 @@ static inline bool mhp_supports_memmap_on_memory(void)
 static inline void pgdat_kswapd_lock(pg_data_t *pgdat) {}
 static inline void pgdat_kswapd_unlock(pg_data_t *pgdat) {}
 static inline void pgdat_kswapd_lock_init(pg_data_t *pgdat) {}
+
+static inline void set_memory_post_plug_callback(memory_post_plug_callback_t callback) {}
+static inline int memory_post_plug_call(u64 addr, u64 size)
+{
+	return 0;
+}
 #endif /* ! CONFIG_MEMORY_HOTPLUG */
 
 /*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 40c7915dabe0..73054ed016fd 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1729,6 +1729,26 @@ bool mhp_range_allowed(u64 start, u64 size, bool need_mapping)
 	return false;
 }
 
+static memory_post_plug_callback_t memory_post_plug_callback __ro_after_init;
+
+void set_memory_post_plug_callback(memory_post_plug_callback_t callback)
+{
+	/* Fatal error to set callback twice in boot stage */
+	if (memory_post_plug_callback)
+		panic("memory_post_plug_callback is already registered\n");
+
+	memory_post_plug_callback = callback;
+}
+
+int memory_post_plug_call(u64 addr, u64 size)
+{
+	if (!memory_post_plug_callback)
+		return 0;
+
+	return (*memory_post_plug_callback)(addr, size);
+}
+EXPORT_SYMBOL_GPL(memory_post_plug_call);
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
  * Scan pfn range [start,end) to find movable/migratable pages (LRU and
-- 
2.52.0

From nobody Mon Jun 8 09:49:34 2026
From: Zhenzhong Duan
To: marcandre.lureau@redhat.com, david@kernel.org, kas@kernel.org,
	rick.p.edgecombe@intel.com, prsampat@amd.com, pbonzini@redhat.com,
	mst@redhat.com, peterx@redhat.com, chenyi.qiang@intel.com,
	elena.reshetova@intel.com, michaeluth@amd.com, ackerleytng@google.com
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	virtualization@lists.linux.dev, x86@kernel.org, yilun.xu@intel.com,
	xiaoyao.li@intel.com, chao.p.peng@intel.com
Subject: [RFC PATCH 2/6] mm/memory_hotplug: Add memory pre-unplug callback infrastructure
Date: Thu, 4 Jun 2026 05:35:47 -0400
Message-ID: <20260604093551.1511079-3-zhenzhong.duan@intel.com>
In-Reply-To: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
References: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In confidential computing environments like TDX, memory that was
previously accepted by the guest should be explicitly "released" back to
the hypervisor before it is unplugged. Otherwise the hypervisor may treat
the unplug request as a no-op without the guest being aware of it, the
memory stays in the "accepted" state, and a later re-plug fails with a
re-accept error.

This callback infrastructure allows the TDX guest code to register a
handler that will be invoked after the kernel removes memory from its
memory management subsystem but before the memory is unplugged, ensuring
all memory pages are properly released via the TDG.MEM.PAGE.RELEASE
TDCALL. A subsequent re-plug then issues TDG.MEM.PAGE.ACCEPT on pages in
the "unaccepted" state and succeeds.

Signed-off-by: Zhenzhong Duan
---
 include/linux/memory_hotplug.h | 10 ++++++++++
 mm/memory_hotplug.c            | 20 ++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 39f0a35a5112..5bb77670b6cf 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -29,6 +29,7 @@ enum mmop {
 };
 
 typedef int (*memory_post_plug_callback_t)(u64 addr, u64 size);
+typedef int (*memory_pre_unplug_callback_t)(u64 addr, u64 size);
 
 #ifdef CONFIG_MEMORY_HOTPLUG
 struct page *pfn_to_online_page(unsigned long pfn);
@@ -278,6 +279,9 @@ extern int remove_memory(u64 start, u64 size);
 extern void __remove_memory(u64 start, u64 size);
 extern int offline_and_remove_memory(u64 start, u64 size);
 
+void set_memory_pre_unplug_callback(memory_pre_unplug_callback_t callback);
+int memory_pre_unplug_call(u64 addr, u64 size);
+
 #else
 static inline void try_offline_node(int nid) {}
 
@@ -293,6 +297,12 @@ static inline int remove_memory(u64 start, u64 size)
 }
 
 static inline void __remove_memory(u64 start, u64 size) {}
+
+static inline void set_memory_pre_unplug_callback(memory_pre_unplug_callback_t callback) {}
+static inline int memory_pre_unplug_call(u64 addr, u64 size)
+{
+	return 0;
+}
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 #ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 73054ed016fd..fcb6f85c40d0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2451,4 +2451,24 @@ int offline_and_remove_memory(u64 start, u64 size)
 	return rc;
 }
 EXPORT_SYMBOL_GPL(offline_and_remove_memory);
+
+static memory_pre_unplug_callback_t memory_pre_unplug_callback __ro_after_init;
+
+void set_memory_pre_unplug_callback(memory_pre_unplug_callback_t callback)
+{
+	/* Fatal error to set callback twice in boot stage */
+	if (memory_pre_unplug_callback)
+		panic("memory_pre_unplug_callback is already registered\n");
+
+	memory_pre_unplug_callback = callback;
+}
+
+int memory_pre_unplug_call(u64 addr, u64 size)
+{
+	if (!memory_pre_unplug_callback)
+		return 0;
+
+	return (*memory_pre_unplug_callback)(addr, size);
+}
+EXPORT_SYMBOL_GPL(memory_pre_unplug_call);
 #endif /* CONFIG_MEMORY_HOTREMOVE */
-- 
2.52.0

From nobody Mon Jun 8 09:49:34 2026
From: Zhenzhong Duan
To: marcandre.lureau@redhat.com, david@kernel.org, kas@kernel.org,
	rick.p.edgecombe@intel.com, prsampat@amd.com, pbonzini@redhat.com,
	mst@redhat.com, peterx@redhat.com, chenyi.qiang@intel.com,
	elena.reshetova@intel.com, michaeluth@amd.com, ackerleytng@google.com
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	virtualization@lists.linux.dev, x86@kernel.org, yilun.xu@intel.com,
	xiaoyao.li@intel.com, chao.p.peng@intel.com
Subject: [RFC PATCH 3/6] virtio-mem: Integrate memory acceptance and release callbacks
Date: Thu, 4 Jun 2026 05:35:48 -0400
Message-ID: <20260604093551.1511079-4-zhenzhong.duan@intel.com>
In-Reply-To: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
References: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Integrate the memory post-plug and pre-unplug callbacks into virtio-mem's
plug and unplug operations to support TDX memory acceptance and release.

For memory plugging, call the post-plug callback after successfully
requesting memory from the hypervisor to ensure newly added memory is
accepted by TDX guests. If acceptance fails, return -EINVAL to mark the
device as broken rather than attempting rollback, since unplug operations
may also fail and partial acceptance creates a state that is difficult to
recover from.

For memory unplugging, call the pre-unplug callback before requesting
memory removal from the hypervisor to allow TDX guests to release memory
pages. If release fails, return -EINVAL to mark the device as broken. If
the hypervisor unplug request fails after a successful memory release,
attempt to re-accept the memory to restore a consistent state for the
retry. If re-acceptance fails, mark the device as broken to prevent
corruption.

The config_changed check is moved to the wrapper functions to ensure
callbacks are not invoked unnecessarily when operations will be retried.

This integration ensures proper memory lifecycle management in
confidential computing environments while maintaining backward
compatibility with non-TDX systems, where the callbacks are no-ops.

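
As a reading aid, the unplug ordering described above can be modeled in
user space. This is only a sketch under stated assumptions: struct
flow_model and model_unplug() are hypothetical names that do not exist in
the driver, and the three result fields merely stand in for
memory_pre_unplug_call(), the hypervisor unplug request, and
memory_post_plug_call().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real hooks; 0 means success. */
struct flow_model {
	int release_rc;      /* result of the pre-unplug (release) callback */
	int unplug_rc;       /* result of the hypervisor unplug request */
	int reaccept_rc;     /* result of the post-plug (accept) callback */
	bool config_changed; /* pending config update, operation is retried */
};

/* Mirrors the ordering the commit message describes for unplug. */
static int model_unplug(const struct flow_model *m)
{
	int ret;

	if (m->config_changed)
		return -EAGAIN;	/* checked first, so no callback runs */

	if (m->release_rc)
		return -EINVAL;	/* release failed: device marked broken */

	ret = m->unplug_rc;
	if (ret && m->reaccept_rc)
		return -EINVAL;	/* accepted state could not be restored */

	return ret;		/* 0, or the retryable hypervisor error */
}
```

In this model a failed hypervisor request with a successful re-accept
returns the original error, so the caller can retry from a consistent
state. Every other failure collapses to -EINVAL.
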
Signed-off-by: Zhenzhong Duan
---
 drivers/virtio/virtio_mem.c | 80 ++++++++++++++++++++++++++++++++-----
 1 file changed, 70 insertions(+), 10 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 48051e9e98ab..12b8229dab0d 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1416,8 +1416,8 @@ static uint64_t virtio_mem_send_request(struct virtio_mem *vm,
 	return virtio16_to_cpu(vm->vdev, vm->resp.type);
 }
 
-static int virtio_mem_send_plug_request(struct virtio_mem *vm, uint64_t addr,
-					uint64_t size)
+static int _virtio_mem_send_plug_request(struct virtio_mem *vm, uint64_t addr,
+					 uint64_t size)
 {
 	const uint64_t nb_vm_blocks = size / vm->device_block_size;
 	const struct virtio_mem_req req = {
@@ -1427,9 +1427,6 @@ static int virtio_mem_send_plug_request(struct virtio_mem *vm, uint64_t addr,
 	};
 	int rc = -ENOMEM;
 
-	if (atomic_read(&vm->config_changed))
-		return -EAGAIN;
-
 	dev_dbg(&vm->vdev->dev, "plugging memory: 0x%llx - 0x%llx\n", addr,
 		addr + size - 1);
 
@@ -1454,8 +1451,8 @@ static int virtio_mem_send_plug_request(struct virtio_mem *vm, uint64_t addr,
 	return rc;
 }
 
-static int virtio_mem_send_unplug_request(struct virtio_mem *vm, uint64_t addr,
-					  uint64_t size)
+static int _virtio_mem_send_unplug_request(struct virtio_mem *vm, uint64_t addr,
+					   uint64_t size)
 {
 	const uint64_t nb_vm_blocks = size / vm->device_block_size;
 	const struct virtio_mem_req req = {
@@ -1465,9 +1462,6 @@ static int virtio_mem_send_unplug_request(struct virtio_mem *vm, uint64_t addr,
 	};
 	int rc = -ENOMEM;
 
-	if (atomic_read(&vm->config_changed))
-		return -EAGAIN;
-
 	dev_dbg(&vm->vdev->dev, "unplugging memory: 0x%llx - 0x%llx\n", addr,
 		addr + size - 1);
 
@@ -1489,6 +1483,72 @@ static int virtio_mem_send_unplug_request(struct virtio_mem *vm, uint64_t addr,
 	return rc;
 }
 
+static int virtio_mem_send_plug_request(struct virtio_mem *vm, uint64_t addr,
+					uint64_t size)
+{
+	int ret;
+
+	if (atomic_read(&vm->config_changed))
+		return -EAGAIN;
+
+	ret = _virtio_mem_send_plug_request(vm, addr, size);
+	if (ret)
+		return ret;
+
+	/*
+	 * If memory acceptance fails, we cannot safely rollback to the pre-plug
+	 * state because the unplug operation may also fail (e.g., hypervisor
+	 * out of memory, VM migration in progress). Additionally, acceptance
+	 * failures may be partial, leaving some pages accepted and others not,
+	 * creating inconsistent memory state that is difficult to track and
+	 * recover from.
+	 *
+	 * Rather than attempting complex state recovery that may fail, we treat
+	 * acceptance failure as a critical error and return -EINVAL. This causes
+	 * the caller to set the broken flag and stop processing further requests,
+	 * preventing potential memory corruption or system instability. As a
+	 * consequence, the hypervisor-side memory for the failing range is
+	 * leaked for the lifetime of the device.
+	 */
+	if (memory_post_plug_call(addr, size))
+		return -EINVAL;
+
+	return 0;
+}
+
+static int virtio_mem_send_unplug_request(struct virtio_mem *vm, uint64_t addr,
+					  uint64_t size)
+{
+	int ret;
+
+	if (atomic_read(&vm->config_changed))
+		return -EAGAIN;
+
+	/*
+	 * If memory release fails, treat it as a critical error similar to
+	 * acceptance failure. See virtio_mem_send_plug_request() for detailed
+	 * rationale on why we avoid complex error recovery.
+	 */
+	ret = memory_pre_unplug_call(addr, size);
+	if (ret)
+		return -EINVAL;
+
+	ret = _virtio_mem_send_unplug_request(vm, addr, size);
+	/*
+	 * If the hypervisor unplug request fails (e.g., out of memory, VM
+	 * migration), the operation will be retried later. Since we already
+	 * released the memory from TDX perspective, we must re-accept it to
+	 * restore consistent state for the next retry. If re-acceptance fails,
+	 * treat it as critical error to prevent state corruption. As a
+	 * consequence, the hypervisor-side memory for the failing range is
+	 * leaked for the lifetime of the device.
+	 */
+	if (ret && memory_post_plug_call(addr, size))
+		return -EINVAL;
+
+	return ret;
+}
+
 static int virtio_mem_send_unplug_all_request(struct virtio_mem *vm)
 {
 	const struct virtio_mem_req req = {
-- 
2.52.0

From nobody Mon Jun 8 09:49:34 2026
From: Zhenzhong Duan
To: marcandre.lureau@redhat.com, david@kernel.org, kas@kernel.org,
	rick.p.edgecombe@intel.com, prsampat@amd.com, pbonzini@redhat.com,
	mst@redhat.com, peterx@redhat.com, chenyi.qiang@intel.com,
	elena.reshetova@intel.com, michaeluth@amd.com, ackerleytng@google.com
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	virtualization@lists.linux.dev, x86@kernel.org, yilun.xu@intel.com,
	xiaoyao.li@intel.com, chao.p.peng@intel.com
Subject: [RFC PATCH 4/6] x86/tdx: Register memory post-plug callback for TDX guests
Date: Thu, 4 Jun 2026 05:35:49 -0400
Message-ID: <20260604093551.1511079-5-zhenzhong.duan@intel.com>
In-Reply-To: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
References: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Register a callback to handle memory acceptance after memory plugging in
TDX guests. When memory is added by virtio-mem or other memory hotplug
drivers, the TDX guest must accept the memory pages using the
TDG.MEM.PAGE.ACCEPT TDCALL before they can be safely accessed.

The callback uses the existing tdx_accept_memory() function to accept all
pages in the newly plugged memory range. Without this callback, newly
added memory would remain in the "unaccepted" state, and any access to
these pages would trigger VM exits and potentially cause guest crashes.

The callback is registered during TDX setup and remains active for the
lifetime of the guest, ensuring all dynamically added memory is properly
accepted before being made available to the kernel's memory management
subsystem.

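
The range validation that tdx_memory_post_plug() in this patch performs
before calling tdx_accept_memory() can be tried out in user space with
the sketch below. It is an illustration only: MODEL_PAGE_SIZE and
model_check_range() are hypothetical names, a 4K page size is assumed,
and the GCC/Clang builtin __builtin_add_overflow() stands in for the
kernel's check_add_overflow().

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumption: 4K base pages */

/*
 * Validate [addr, addr + size) the way the callback does: both values
 * must be page aligned and the end address must not wrap around.
 * On success the exclusive end address is stored in *end.
 */
static int model_check_range(uint64_t addr, uint64_t size, uint64_t *end)
{
	if ((addr | size) & (MODEL_PAGE_SIZE - 1))
		return -EINVAL;

	if (__builtin_add_overflow(addr, size, end))
		return -EINVAL;

	return 0;
}
```

Computing the end address once, with an overflow check, means the accept
call never sees a wrapped range.
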
Signed-off-by: Zhenzhong Duan
---
 arch/x86/coco/tdx/tdx.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 186915a17c50..d93ba092d311 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -326,6 +326,25 @@ static void reduce_unnecessary_ve(void)
 	enable_cpu_topology_enumeration();
 }
 
+static int tdx_memory_post_plug(u64 addr, u64 size)
+{
+	u64 end;
+
+	if (!PAGE_ALIGNED(addr) || !PAGE_ALIGNED(size))
+		return -EINVAL;
+
+	if (check_add_overflow(addr, size, &end))
+		return -EINVAL;
+
+	if (tdx_accept_memory(addr, end))
+		return 0;
+
+	pr_err("Failed to accept memory [0x%llx, 0x%llx)\n",
+	       (unsigned long long)addr, (unsigned long long)end);
+
+	return -EINVAL;
+}
+
 static void tdx_setup(u64 *cc_mask)
 {
 	struct tdx_module_args args = {};
@@ -359,6 +378,8 @@ static void tdx_setup(u64 *cc_mask)
 	disable_sept_ve(td_attr);
 
 	reduce_unnecessary_ve();
+
+	set_memory_post_plug_callback(tdx_memory_post_plug);
 }
 
 /*
-- 
2.52.0

From nobody Mon Jun 8 09:49:34 2026
From: Zhenzhong Duan
To: marcandre.lureau@redhat.com, david@kernel.org, kas@kernel.org,
	rick.p.edgecombe@intel.com, prsampat@amd.com, pbonzini@redhat.com,
	mst@redhat.com, peterx@redhat.com, chenyi.qiang@intel.com,
	elena.reshetova@intel.com, michaeluth@amd.com, ackerleytng@google.com
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	virtualization@lists.linux.dev, x86@kernel.org, yilun.xu@intel.com,
	xiaoyao.li@intel.com, chao.p.peng@intel.com
Subject: [RFC PATCH 5/6] x86/tdx: Register memory pre-unplug callback for TDX guests
Date: Thu, 4 Jun 2026 05:35:50 -0400
Message-ID: <20260604093551.1511079-6-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
References: <20260604093551.1511079-1-zhenzhong.duan@intel.com>

Add support for releasing memory pages before they are unplugged in TDX
guests.

When memory is about to be unplugged by virtio-mem or another memory
hotplug driver, the TDX guest should release the pages back to the
hypervisor with the TDG.MEM.PAGE.RELEASE TDCALL. This makes the guest
more robust against buggy VMM behavior, e.g., a VMM that does nothing in
response to an unplug request.

The implementation detects TDG.MEM.PAGE.RELEASE support and reduces the
number of TDCALLs by trying the larger 1G and 2M page sizes before
falling back to 4K pages. If a release fails, the pages released so far
are re-accepted to keep the memory state consistent.

Without a proper release, re-plugging memory in a TDX guest fails when
the guest accepts that memory: the hypervisor may treat the unplug
request as a no-op, leaving the pages in the "accepted" state.
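For illustration only, the largest-first fallback described above can be
modelled in user space. This is a minimal sketch under stated
assumptions, not the code from this patch: mock_release(), try_one() and
release_range() are hypothetical names, and the mock always succeeds
where the real TDCALL may fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K (UINT64_C(1) << 12)
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/* Number of mock release calls issued by the last release_range(). */
static long nr_calls;

/* Hypothetical stand-in for the release TDCALL: always succeeds. */
static int mock_release(uint64_t addr, uint64_t size)
{
	(void)addr;
	(void)size;
	nr_calls++;
	return 0;
}

/* Release one page of @size at @start; returns bytes released or 0. */
static uint64_t try_one(uint64_t start, uint64_t len, uint64_t size)
{
	/* The page must be naturally aligned and fit in the remaining range. */
	if ((start & (size - 1)) || len < size)
		return 0;
	return mock_release(start, size) ? 0 : size;
}

/* Returns the number of release calls used, or -1 on failure. */
static long release_range(uint64_t start, uint64_t end)
{
	static const uint64_t sizes[] = { SZ_1G, SZ_2M, SZ_4K };
	uint64_t cur = start;

	assert(start <= end);
	nr_calls = 0;
	while (cur < end) {
		uint64_t done = 0;
		size_t i;

		/* Try the largest size first, then fall back to smaller ones. */
		for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]) && !done; i++)
			done = try_one(cur, end - cur, sizes[i]);
		if (!done)
			return -1;
		cur += done;
	}
	return nr_calls;
}
```

With these mocks, release_range(SZ_1G - SZ_4K, 2 * SZ_1G + SZ_2M) issues
three calls: one 4K page up to the 1G boundary, one 1G page, and one 2M
page. A range that is only 4K-aligned degrades to one call per 4K page.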
Signed-off-by: Zhenzhong Duan
---
 arch/x86/include/asm/shared/tdx.h |   2 +
 arch/x86/coco/tdx/tdx.c           | 135 ++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)

diff --git a/arch/x86/include/asm/shared/tdx.h b/arch/x86/include/asm/shared/tdx.h
index 049638e3da74..910ec1e57528 100644
--- a/arch/x86/include/asm/shared/tdx.h
+++ b/arch/x86/include/asm/shared/tdx.h
@@ -19,6 +19,7 @@
 #define TDG_MEM_PAGE_ACCEPT		6
 #define TDG_VM_RD			7
 #define TDG_VM_WR			8
+#define TDG_MEM_PAGE_RELEASE		30
 
 /* TDX TD attributes */
 #define TDX_TD_ATTR_DEBUG_BIT		0
@@ -54,6 +55,7 @@
 
 /* TDCS_CONFIG_FLAGS bits */
 #define TDCS_CONFIG_FLEXIBLE_PENDING_VE	BIT_ULL(1)
+#define TDCS_CONFIG_PAGE_RELEASE	BIT_ULL(6)
 
 /* TDCS_TD_CTLS bits */
 #define TD_CTLS_PENDING_VE_DISABLE_BIT	0
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index d93ba092d311..0abfb3505093 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -345,6 +345,139 @@ static int tdx_memory_post_plug(u64 addr, u64 size)
 	return -EINVAL;
 }
 
+static bool tdx_page_release_supported;
+
+static void detect_mem_page_release(void)
+{
+	u64 config = 0;
+
+	tdg_vm_rd(TDCS_CONFIG_FLAGS, &config);
+
+	tdx_page_release_supported = !!(config & TDCS_CONFIG_PAGE_RELEASE);
+}
+
+static unsigned long try_release_one(phys_addr_t start, unsigned long len,
+				     enum pg_level pg_level)
+{
+	unsigned long release_size = page_level_size(pg_level);
+	struct tdx_module_args args = {};
+	u8 page_size;
+	u64 ret;
+
+	if (!IS_ALIGNED(start, release_size))
+		return 0;
+
+	if (len < release_size)
+		return 0;
+
+	/*
+	 * Pass the page physical address to TDX module to release the
+	 * private page and to put it in PENDING state.
+	 *
+	 * Bits 2:0 of RCX encode page size: 0 - 4K, 1 - 2M, 2 - 1G.
+	 */
+	switch (pg_level) {
+	case PG_LEVEL_4K:
+		page_size = TDX_PS_4K;
+		break;
+	case PG_LEVEL_2M:
+		page_size = TDX_PS_2M;
+		break;
+	case PG_LEVEL_1G:
+		page_size = TDX_PS_1G;
+		break;
+	default:
+		return 0;
+	}
+
+	args.rcx = start | page_size;
+	ret = __tdcall(TDG_MEM_PAGE_RELEASE, &args);
+	if (ret)
+		return 0;
+
+	return release_size;
+}
+
+static bool _tdx_release_memory(phys_addr_t start, phys_addr_t end, phys_addr_t *cur)
+{
+	*cur = start;
+
+	while (*cur < end) {
+		unsigned long len = end - *cur;
+		unsigned long release_size;
+
+		/*
+		 * Try larger release first. It speeds up process by cutting
+		 * number of hypercalls (if successful).
+		 */
+
+		release_size = try_release_one(*cur, len, PG_LEVEL_1G);
+		if (!release_size)
+			release_size = try_release_one(*cur, len, PG_LEVEL_2M);
+		if (!release_size)
+			release_size = try_release_one(*cur, len, PG_LEVEL_4K);
+		if (!release_size)
+			return false;
+		*cur += release_size;
+	}
+
+	return true;
+}
+
+/*
+ * Release memory pages back to the hypervisor in TDX guests.
+ *
+ * @start: Physical start address of memory range to release
+ * @end: Physical end address of memory range to release
+ *
+ * Uses TDG.MEM.PAGE.RELEASE TDCALL to transition private pages back to
+ * pending state. If PAGE_RELEASE is not supported by the TDX
+ * configuration, returns true (success) as no action is needed.
+ *
+ * On partial failure, automatically re-accepts any successfully released
+ * pages to restore consistent memory state. Re-acceptance failure is
+ * treated as a fatal error since it indicates severe TDX module issues.
+ *
+ * Returns: true on success, false on failure
+ */
+static bool tdx_release_memory(phys_addr_t start, phys_addr_t end)
+{
+	phys_addr_t released = start;
+	bool ret;
+
+	if (!tdx_page_release_supported)
+		return true;
+
+	ret = _tdx_release_memory(start, end, &released);
+	if (!ret) {
+		pr_err("Failed to release memory [0x%llx, 0x%llx)\n",
+		       (unsigned long long)start, (unsigned long long)end);
+
+		/*
+		 * Re-accept any pages that were successfully released before
+		 * the failure occurred. This should never fail since we're
+		 * just restoring the previous accepted state.
+		 */
+		if (!tdx_accept_memory(start, released))
+			panic("%s Failed to re-accept memory\n", __func__);
+	}
+
+	return ret;
+}
+
+static int tdx_memory_pre_unplug(u64 addr, u64 size)
+{
+	u64 end;
+
+	if (!PAGE_ALIGNED(addr) || !PAGE_ALIGNED(size))
+		return -EINVAL;
+
+	if (check_add_overflow(addr, size, &end))
+		return -EINVAL;
+
+	return tdx_release_memory(addr, end) ? 0 : -EINVAL;
+}
+
 static void tdx_setup(u64 *cc_mask)
 {
 	struct tdx_module_args args = {};
@@ -380,6 +513,8 @@ static void tdx_setup(u64 *cc_mask)
 	reduce_unnecessary_ve();
 
 	set_memory_post_plug_callback(tdx_memory_post_plug);
+	detect_mem_page_release();
+	set_memory_pre_unplug_callback(tdx_memory_pre_unplug);
 }
 
 /*
-- 
2.52.0

From nobody Mon Jun 8 09:49:34 2026
From: Zhenzhong Duan
To: marcandre.lureau@redhat.com, david@kernel.org, kas@kernel.org,
	rick.p.edgecombe@intel.com, prsampat@amd.com, pbonzini@redhat.com,
	mst@redhat.com, peterx@redhat.com, chenyi.qiang@intel.com,
	elena.reshetova@intel.com, michaeluth@amd.com, ackerleytng@google.com
Cc: linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
	virtualization@lists.linux.dev, x86@kernel.org, yilun.xu@intel.com,
	xiaoyao.li@intel.com, chao.p.peng@intel.com
Subject: [RFC PATCH 6/6] x86/tdx: Release private memory before private->shared conversion
Date: Thu, 4 Jun 2026 05:35:51 -0400
Message-ID: <20260604093551.1511079-7-zhenzhong.duan@intel.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260604093551.1511079-1-zhenzhong.duan@intel.com>
References: <20260604093551.1511079-1-zhenzhong.duan@intel.com>

TDX supports a PAGE.RELEASE feature. When it is configured, the host can
remove a private page only after the guest has released it and put it in
the PENDING state through TDG.MEM.PAGE.RELEASE.

When TDX PAGE.RELEASE is supported, release private memory pages before
converting them to the shared state, so that they transition from the
accepted to the pending state.

The release handles the case where the hypervisor retains old private
pages across the conversion. Without it, a subsequent shared->private
conversion could hit re-acceptance errors when the guest tries to accept
pages that are still in the accepted state.

If the release fails, abort the conversion to avoid an inconsistent
memory state. Note that if tdx_map_gpa() fails after a successful
release, we cannot safely roll back: the GPA mapping may have partially
succeeded, leaving a mix of shared and private pages that cannot be
reliably tracked or recovered.

Co-developed-by: Xu Yilun
Signed-off-by: Xu Yilun
Signed-off-by: Zhenzhong Duan
---
 arch/x86/coco/tdx/tdx.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 0abfb3505093..ecee6df92395 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -1121,7 +1121,25 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 {
 	phys_addr_t start = __pa(vaddr);
 	phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
+	bool release_required = !enc && tdx_page_release_supported;
 
+	/*
+	 * For private->shared conversion, release memory pages first.
+	 * This transitions pages from accepted to pending state to be
+	 * more robust with buggy VMM, e.g., VMM may keep old pages,
+	 * when converting back to private, re-accept error triggers.
+	 */
+	if (release_required && !tdx_release_memory(start, end))
+		return false;
+
+	/*
+	 * Update the GPA mapping state. If this fails, we cannot rollback
+	 * by calling tdx_accept_memory() because tdx_map_gpa() may have
+	 * partially succeeded, creating a mix of shared and private pages.
+	 * Attempting to accept the entire range would fail on pages that
+	 * are still in shared state, and we have no way to determine which
+	 * pages are in which state after partial failure.
+	 */
 	if (!tdx_map_gpa(start, end, enc))
 		return false;
 
-- 
2.52.0
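
For illustration only, the accepted/pending transitions that patch 6/6
relies on can be modelled as a two-state machine. This is a simplified
sketch based on the commit message, not the TDX module specification:
the enum and the model_*() and round_trip() helpers are hypothetical
names, and the "VMM" here is assumed to keep the old private page across
both conversions.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified guest-visible states of one private page (a model only). */
enum page_state { PAGE_PENDING, PAGE_ACCEPTED };

/* Accept succeeds only on a PENDING page; otherwise it is a re-accept error. */
static bool model_accept(enum page_state *page)
{
	if (*page != PAGE_PENDING)
		return false;
	*page = PAGE_ACCEPTED;
	return true;
}

/* Release moves an ACCEPTED page back to PENDING. */
static bool model_release(enum page_state *page)
{
	if (*page != PAGE_ACCEPTED)
		return false;
	*page = PAGE_PENDING;
	return true;
}

/*
 * Model a private->shared->private round trip against a VMM that keeps
 * the old private page. Returns true if the final accept succeeds.
 */
static bool round_trip(bool release_first)
{
	enum page_state page = PAGE_ACCEPTED;

	if (release_first && !model_release(&page))
		return false;

	/* Both conversions happen here; the assumed VMM leaves the page as is. */

	return model_accept(&page);
}
```

In this model round_trip(true) succeeds because the page is back in the
pending state before the final accept, while round_trip(false) fails
with the re-accept error the commit message describes.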