From nobody Sat Nov 15 12:16:20 2025
From: Xiaoyao Li
To: Paolo Bonzini, David Hildenbrand, ackerleytng@google.com, seanjc@google.com
Cc: Fuad Tabba, Vishal Annapurve, rick.p.edgecombe@intel.com, Kai Huang,
 binbin.wu@linux.intel.com, yan.y.zhao@intel.com, ira.weiny@intel.com,
 michael.roth@amd.com, kvm@vger.kernel.org, qemu-devel@nongnu.org, Peter Xu,
 Philippe Mathieu-Daudé
Subject: [POC PATCH 1/5] update-linux-headers: Add guestmem.h
Date: Tue, 15 Jul 2025 11:31:37 +0800
Message-ID: <20250715033141.517457-2-xiaoyao.li@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250715033141.517457-1-xiaoyao.li@intel.com>
References: <20250715033141.517457-1-xiaoyao.li@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Xiaoyao Li
---
 scripts/update-linux-headers.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/update-linux-headers.sh b/scripts/update-linux-headers.sh
index b43b8ef75a63..3f6169a121a8 100755
--- a/scripts/update-linux-headers.sh
+++ b/scripts/update-linux-headers.sh
@@ -200,7 +200,7 @@ rm -rf "$output/linux-headers/linux"
 mkdir -p "$output/linux-headers/linux"
 for header in const.h stddef.h kvm.h vfio.h vfio_ccw.h vfio_zdev.h vhost.h \
               psci.h psp-sev.h userfaultfd.h memfd.h mman.h nvme_ioctl.h \
-              vduse.h iommufd.h bits.h; do
+              vduse.h iommufd.h bits.h guestmem.h; do
     cp "$hdrdir/include/linux/$header" "$output/linux-headers/linux"
 done
 
-- 
2.43.0

From nobody Sat Nov 15 12:16:20 2025
From: Xiaoyao Li
To: Paolo Bonzini, David Hildenbrand, ackerleytng@google.com, seanjc@google.com
Cc: Fuad Tabba, Vishal Annapurve, rick.p.edgecombe@intel.com, Kai Huang,
 binbin.wu@linux.intel.com, yan.y.zhao@intel.com, ira.weiny@intel.com,
 michael.roth@amd.com, kvm@vger.kernel.org, qemu-devel@nongnu.org, Peter Xu,
 Philippe Mathieu-Daudé
Subject: [POC PATCH 2/5] headers: Fetch gmem updates
Date: Tue, 15 Jul 2025 11:31:38 +0800
Message-ID: <20250715033141.517457-3-xiaoyao.li@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250715033141.517457-1-xiaoyao.li@intel.com>
References: <20250715033141.517457-1-xiaoyao.li@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Xiaoyao Li
---
 linux-headers/linux/guestmem.h | 29 +++++++++++++++++++++++++++++
 linux-headers/linux/kvm.h      | 18 ++++++++++++++++++
 2 files changed, 47 insertions(+)
 create mode 100644 linux-headers/linux/guestmem.h

diff --git a/linux-headers/linux/guestmem.h b/linux-headers/linux/guestmem.h
new file mode 100644
index 000000000000..be045fbad230
--- /dev/null
+++ b/linux-headers/linux/guestmem.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _LINUX_GUESTMEM_H
+#define _LINUX_GUESTMEM_H
+
+/*
+ * Huge page size must be explicitly defined when using the guestmem_hugetlb
+ * allocator for guest_memfd. It is the responsibility of the application to
+ * know which sizes are supported on the running system. See mmap(2) man page
+ * for details.
+ */
+
+#define GUESTMEM_HUGETLB_FLAG_SHIFT 58
+#define GUESTMEM_HUGETLB_FLAG_MASK 0x3fUL
+
+#define GUESTMEM_HUGETLB_FLAG_16KB (14UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_64KB (16UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_512KB (19UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_1MB (20UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_2MB (21UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_8MB (23UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_16MB (24UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_32MB (25UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_256MB (28UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_512MB (29UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_1GB (30UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_2GB (31UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+#define GUESTMEM_HUGETLB_FLAG_16GB (34UL << GUESTMEM_HUGETLB_FLAG_SHIFT)
+
+#endif /* _LINUX_GUESTMEM_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 32c5885a3c20..ff9ef5fb37c5 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -952,6 +952,9 @@ struct kvm_enable_cap {
 #define KVM_CAP_ARM_EL2 240
 #define KVM_CAP_ARM_EL2_E2H0 241
 #define KVM_CAP_RISCV_MP_STATE_RESET 242
+#define KVM_CAP_GMEM_SHARED_MEM 240
+#define KVM_CAP_GMEM_CONVERSION 241
+#define KVM_CAP_GMEM_HUGETLB 242
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
@@ -1589,12 +1592,27 @@ struct kvm_memory_attributes {
 
 #define KVM_CREATE_GUEST_MEMFD _IOWR(KVMIO, 0xd4, struct kvm_create_guest_memfd)
 
+#define GUEST_MEMFD_FLAG_SUPPORT_SHARED (1UL << 0)
+#define GUEST_MEMFD_FLAG_INIT_PRIVATE (1UL << 1)
+#define GUEST_MEMFD_FLAG_HUGETLB (1UL << 2)
+
 struct kvm_create_guest_memfd {
 	__u64 size;
 	__u64 flags;
 	__u64 reserved[6];
 };
 
+#define KVM_GMEM_IO 0xAF
+#define KVM_GMEM_CONVERT_SHARED _IOWR(KVM_GMEM_IO, 0x41, struct kvm_gmem_convert)
+#define KVM_GMEM_CONVERT_PRIVATE _IOWR(KVM_GMEM_IO, 0x42, struct kvm_gmem_convert)
+
+struct kvm_gmem_convert {
+	__u64 offset;
+	__u64 size;
+	__u64 error_offset;
+	__u64 reserved[5];
+};
+
 #define KVM_PRE_FAULT_MEMORY _IOWR(KVMIO, 0xd5, struct kvm_pre_fault_memory)
 
 struct kvm_pre_fault_memory {
-- 
2.43.0

From nobody Sat Nov 15 12:16:20 2025
From: Xiaoyao Li
To: Paolo Bonzini, David Hildenbrand, ackerleytng@google.com, seanjc@google.com
Cc: Fuad Tabba, Vishal Annapurve, rick.p.edgecombe@intel.com, Kai Huang,
 binbin.wu@linux.intel.com, yan.y.zhao@intel.com, ira.weiny@intel.com,
 michael.roth@amd.com, kvm@vger.kernel.org, qemu-devel@nongnu.org, Peter Xu,
 Philippe Mathieu-Daudé
Subject: [POC PATCH 3/5] memory/guest_memfd: Enable in-place conversion when available
Date: Tue, 15 Jul 2025 11:31:39 +0800
Message-ID: <20250715033141.517457-4-xiaoyao.li@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250715033141.517457-1-xiaoyao.li@intel.com>
References: <20250715033141.517457-1-xiaoyao.li@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Yan Zhao

(This is just POC code for using gmem with in-place conversion.)

Try to use in-place conversion gmem when it is supported.

When in-place conversion is enabled, there is no need to discard memory
on conversion, since the same memory continues to be used with the
opposite attribute afterwards.

For an upstreamable solution, we can introduce memory-backend-guestmemfd
for in-place conversion. Without in-place conversion, separate non-gmem
memory is needed to back the shared memory, and gmem is created
implicitly and internally based on the VM type. With in-place
conversion, there is no need for separate non-gmem memory because gmem
itself can serve as shared memory. So we can introduce
memory-backend-guestmemfd as the dedicated backend for in-place
conversion gmem.

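For illustration only (not part of the patch): a minimal standalone sketch
of how userspace might issue the in-place conversion described above. The
ioctl numbers and struct kvm_gmem_convert are local copies of the proposed
uAPI from patch 2/5 and may change; gmem_convert_op() and
gmem_convert_range() are hypothetical helper names, and returning -errno is
an assumption of this sketch rather than QEMU's convention.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Local copies of the uAPI proposed in patch 2/5 (not in released headers). */
#define KVM_GMEM_IO 0xAF

struct kvm_gmem_convert {
    uint64_t offset;
    uint64_t size;
    uint64_t error_offset;
    uint64_t reserved[5];
};

#define KVM_GMEM_CONVERT_SHARED \
    _IOWR(KVM_GMEM_IO, 0x41, struct kvm_gmem_convert)
#define KVM_GMEM_CONVERT_PRIVATE \
    _IOWR(KVM_GMEM_IO, 0x42, struct kvm_gmem_convert)

/* Hypothetical helper: pick the conversion ioctl for the target state. */
static unsigned long gmem_convert_op(bool to_private)
{
    return to_private ? KVM_GMEM_CONVERT_PRIVATE : KVM_GMEM_CONVERT_SHARED;
}

/*
 * Hypothetical helper: convert [offset, offset + size) of a guest_memfd in
 * place. Returns 0 on success or a negative errno. On failure, the
 * error_offset field filled in by the kernel is copied to *error_offset.
 */
static int gmem_convert_range(int gmem_fd, uint64_t offset, uint64_t size,
                              bool to_private, uint64_t *error_offset)
{
    struct kvm_gmem_convert param = { .offset = offset, .size = size };

    if (ioctl(gmem_fd, gmem_convert_op(to_private), &param) < 0) {
        int err = errno; /* save before anything else can clobber it */

        if (error_offset) {
            *error_offset = param.error_offset;
        }
        return -err;
    }
    return 0;
}
```

Because the conversion acts on the guest_memfd itself, no separate discard
of the range is needed afterwards, which is the point made above.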
Signed-off-by: Yan Zhao
Co-developed-by: Xiaoyao Li
Signed-off-by: Xiaoyao Li
---
 accel/kvm/kvm-all.c       | 79 ++++++++++++++++++++++++++++-----------
 accel/stubs/kvm-stub.c    |  1 +
 include/system/kvm.h      |  1 +
 include/system/memory.h   |  2 +
 include/system/ramblock.h |  1 +
 system/memory.c           |  7 ++++
 system/physmem.c          | 21 ++++++++++-
 7 files changed, 90 insertions(+), 22 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index a106d1ba0f0b..609537738d38 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -105,6 +105,7 @@ static int kvm_sstep_flags;
 static bool kvm_immediate_exit;
 static uint64_t kvm_supported_memory_attributes;
 static bool kvm_guest_memfd_supported;
+bool kvm_guest_memfd_inplace_supported;
 static hwaddr kvm_max_slot_size = ~0;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -1487,6 +1488,30 @@ static int kvm_set_memory_attributes(hwaddr start, uint64_t size, uint64_t attr)
     return r;
 }
 
+static int kvm_set_guest_memfd_shareability(MemoryRegion *mr, ram_addr_t offset,
+                                            uint64_t size, bool shared)
+{
+    int guest_memfd = mr->ram_block->guest_memfd;
+    struct kvm_gmem_convert param = {
+        .offset = offset,
+        .size = size,
+        .error_offset = 0,
+    };
+    unsigned long op;
+    int r;
+
+    op = shared ? KVM_GMEM_CONVERT_SHARED : KVM_GMEM_CONVERT_PRIVATE;
+
+    r = ioctl(guest_memfd, op, &param);
+    if (r) {
+        error_report("failed to set guest_memfd offset 0x%lx size 0x%lx to %s "
+                     "error '%s' error offset 0x%llx",
+                     offset, size, shared ? "shared" : "private",
+                     strerror(errno), param.error_offset);
+    }
+    return r;
+}
+
 int kvm_set_memory_attributes_private(hwaddr start, uint64_t size)
 {
     return kvm_set_memory_attributes(start, size, KVM_MEMORY_ATTRIBUTE_PRIVATE);
@@ -1604,7 +1629,8 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
                 abort();
             }
 
-            if (memory_region_has_guest_memfd(mr)) {
+            if (memory_region_has_guest_memfd(mr) &&
+                !memory_region_guest_memfd_in_place_conversion(mr)) {
                 err = kvm_set_memory_attributes_private(start_addr, slot_size);
                 if (err) {
                     error_report("%s: failed to set memory attribute private: %s",
@@ -2779,6 +2805,9 @@ static int kvm_init(AccelState *as, MachineState *ms)
         kvm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
         kvm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
         (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
+    kvm_guest_memfd_inplace_supported =
+        kvm_check_extension(s, KVM_CAP_GMEM_SHARED_MEM) &&
+        kvm_check_extension(s, KVM_CAP_GMEM_CONVERSION);
     kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
 
     if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
@@ -3056,6 +3085,7 @@ static void kvm_eat_signals(CPUState *cpu)
 
 int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
 {
+    bool in_place_conversion = false;
     MemoryRegionSection section;
     ram_addr_t offset;
     MemoryRegion *mr;
@@ -3112,18 +3142,23 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
         goto out_unref;
     }
 
-    if (to_private) {
-        ret = kvm_set_memory_attributes_private(start, size);
-    } else {
-        ret = kvm_set_memory_attributes_shared(start, size);
-    }
-    if (ret) {
-        goto out_unref;
-    }
-
     addr = memory_region_get_ram_ptr(mr) + section.offset_within_region;
     rb = qemu_ram_block_from_host(addr, false, &offset);
 
+    in_place_conversion = memory_region_guest_memfd_in_place_conversion(mr);
+    if (in_place_conversion) {
+        ret = kvm_set_guest_memfd_shareability(mr, offset, size, !to_private);
+    } else {
+        if (to_private) {
+            ret = kvm_set_memory_attributes_private(start, size);
+        } else {
+            ret = kvm_set_memory_attributes_shared(start, size);
+        }
+    }
+    if (ret) {
+        goto out_unref;
+    }
+
     ret = ram_block_attributes_state_change(RAM_BLOCK_ATTRIBUTES(mr->rdm),
                                             offset, size, to_private);
     if (ret) {
@@ -3133,17 +3168,19 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
         goto out_unref;
     }
 
-    if (to_private) {
-        if (rb->page_size != qemu_real_host_page_size()) {
-            /*
-             * shared memory is backed by hugetlb, which is supposed to be
-             * pre-allocated and doesn't need to be discarded
-             */
-            goto out_unref;
-        }
-        ret = ram_block_discard_range(rb, offset, size);
-    } else {
-        ret = ram_block_discard_guest_memfd_range(rb, offset, size);
+    if (!in_place_conversion) {
+        if (to_private) {
+            if (rb->page_size != qemu_real_host_page_size()) {
+                /*
+                 * shared memory is backed by hugetlb, which is supposed to be
+                 * pre-allocated and doesn't need to be discarded
+                 */
+                goto out_unref;
+            }
+            ret = ram_block_discard_range(rb, offset, size);
+        } else {
+            ret = ram_block_discard_guest_memfd_range(rb, offset, size);
+        }
     }
 
 out_unref:
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 68cd33ba9735..bf0ccae27b62 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -24,6 +24,7 @@ bool kvm_gsi_direct_mapping;
 bool kvm_allowed;
 bool kvm_readonly_mem_allowed;
 bool kvm_msi_use_devid;
+bool kvm_guest_memfd_inplace_supported;
 
 void kvm_flush_coalesced_mmio_buffer(void)
 {
diff --git a/include/system/kvm.h b/include/system/kvm.h
index 3c7d31473663..32f2be5f92e1 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -43,6 +43,7 @@ extern bool kvm_gsi_direct_mapping;
 extern bool kvm_readonly_mem_allowed;
 extern bool kvm_msi_use_devid;
 extern bool kvm_pre_fault_memory_supported;
+extern bool kvm_guest_memfd_inplace_supported;
 
 #define kvm_enabled() (kvm_allowed)
 /**
diff --git a/include/system/memory.h b/include/system/memory.h
index 46248d4a52c4..f14fbf65805d 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1812,6 +1812,8 @@ bool memory_region_is_protected(MemoryRegion *mr);
  */
 bool memory_region_has_guest_memfd(MemoryRegion *mr);
 
+bool memory_region_guest_memfd_in_place_conversion(MemoryRegion *mr);
+
 /**
  * memory_region_get_iommu: check whether a memory region is an iommu
  *
diff --git a/include/system/ramblock.h b/include/system/ramblock.h
index 87e847e184aa..87757940ea21 100644
--- a/include/system/ramblock.h
+++ b/include/system/ramblock.h
@@ -46,6 +46,7 @@ struct RAMBlock {
     int fd;
     uint64_t fd_offset;
     int guest_memfd;
+    uint64_t guest_memfd_flags;
     RamBlockAttributes *attributes;
     size_t page_size;
     /* dirty bitmap used during migration */
diff --git a/system/memory.c b/system/memory.c
index e8d9b15b28f6..6870a41629ef 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -35,6 +35,7 @@
 
 #include "memory-internal.h"
 
+#include
 //#define DEBUG_UNASSIGNED
 
 static unsigned memory_region_transaction_depth;
@@ -1878,6 +1879,12 @@ bool memory_region_has_guest_memfd(MemoryRegion *mr)
     return mr->ram_block && mr->ram_block->guest_memfd >= 0;
 }
 
+bool memory_region_guest_memfd_in_place_conversion(MemoryRegion *mr)
+{
+    return mr && memory_region_has_guest_memfd(mr) &&
+           (mr->ram_block->guest_memfd_flags & GUEST_MEMFD_FLAG_SUPPORT_SHARED);
+}
+
 uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
 {
     uint8_t mask = mr->dirty_log_mask;
diff --git a/system/physmem.c b/system/physmem.c
index 130c148ffb5c..955480685310 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -89,6 +89,9 @@
 
 #include "memory-internal.h"
 
+#include
+#include
+
 //#define DEBUG_SUBPAGE
 
 /* ram_list is read under rcu_read_lock()/rcu_read_unlock(). Writes
@@ -1913,6 +1916,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
 
     if (new_block->flags & RAM_GUEST_MEMFD) {
         int ret;
+        bool in_place = kvm_guest_memfd_inplace_supported;
+
+        new_block->guest_memfd_flags = 0;
 
         if (!kvm_enabled()) {
             error_setg(errp, "cannot set up private guest memory for %s: KVM required",
@@ -1929,13 +1935,26 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
             goto out_free;
         }
 
+        if (in_place) {
+            new_block->guest_memfd_flags |= GUEST_MEMFD_FLAG_SUPPORT_SHARED |
+                                            GUEST_MEMFD_FLAG_INIT_PRIVATE;
+        }
+
         new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
-                                                        0, errp);
+                                                        new_block->guest_memfd_flags, errp);
         if (new_block->guest_memfd < 0) {
             qemu_mutex_unlock_ramlist();
             goto out_free;
         }
 
+        if (in_place) {
+            qemu_ram_munmap(new_block->fd, new_block->host, new_block->max_length);
+            new_block->host = qemu_ram_mmap(new_block->guest_memfd,
+                                            new_block->max_length,
+                                            QEMU_VMALLOC_ALIGN,
+                                            QEMU_MAP_SHARED, 0);
+        }
+
         /*
          * The attribute bitmap of the RamBlockAttributes is default to
          * discarded, which mimics the behavior of kvm_set_phys_mem() when it
-- 
2.43.0

From nobody Sat Nov 15 12:16:20 2025
h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=fKkGPc1QzdYuySQapWjMTAOxoSCk5X7NWG41AdPHzRc=; b=LYsSxvEOXAKvpmRNWCuoPh6MrVthgTStNKRZch7lps9uzLhZ2LKelkJY2YylX7dw8Y5XfvgB8SPOSbchMfI8LmZtINbpWL1KGUqR5rKwdBWLxe9o8glyzcDNEDeiHRcR1wu9oG9fF15csvYcoUaIyxbvHKxOo5Yek/8J6BC+DAs= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1752550857806395.55623034381654; Mon, 14 Jul 2025 20:40:57 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ubWWa-0004Ak-Ed; Mon, 14 Jul 2025 23:40:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ubWWX-00049S-T0 for qemu-devel@nongnu.org; Mon, 14 Jul 2025 23:40:10 -0400 Received: from mgamail.intel.com ([192.198.163.8]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ubWWV-0006Rk-Rh for qemu-devel@nongnu.org; Mon, 14 Jul 2025 23:40:09 -0400 Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jul 2025 20:40:07 -0700 Received: from lxy-clx-4s.sh.intel.com ([10.239.48.52]) by fmviesa002.fm.intel.com with ESMTP; 14 Jul 2025 20:40:04 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752550808; x=1784086808; h=from:to:cc:subject:date:message-id:in-reply-to: 
From: Xiaoyao Li
To: Paolo Bonzini, David Hildenbrand, ackerleytng@google.com, seanjc@google.com
Cc: Fuad Tabba, Vishal Annapurve, rick.p.edgecombe@intel.com, Kai Huang,
 binbin.wu@linux.intel.com, yan.y.zhao@intel.com, ira.weiny@intel.com,
 michael.roth@amd.com, kvm@vger.kernel.org, qemu-devel@nongnu.org, Peter Xu,
 Philippe Mathieu-Daudé
Subject: [POC PATCH 4/5] memory/guest_memfd: Enable hugetlb support
Date: Tue, 15 Jul 2025 11:31:40 +0800
Message-ID: <20250715033141.517457-5-xiaoyao.li@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250715033141.517457-1-xiaoyao.li@intel.com>
References: <20250715033141.517457-1-xiaoyao.li@intel.com>

(This is just POC code for using gmem with hugetlb.)

Try hugetlb first when gmem supports it. If hugetlb cannot satisfy the
requested memory size and creation returns -ENOMEM, fall back to creating
gmem without hugetlb.

The hugetlb size is hardcoded as GUESTMEM_HUGETLB_FLAG_2MB. I'm not sure
whether it would be better for gmem to report the supported hugetlb
sizes. But the current memfd implementation just tries the hugetlb size
requested by the user and fails when it is not supported. Hence gmem can
do the same without the supported sizes being enumerated.

For an upstreamable solution, the hugetlb support of gmem can be
implemented as "hugetlb" and "hugetlbsize" properties of
memory-backend-guestmemfd, similar to memory-backend-memfd.
(It requires memory-backend-guestmemfd to be introduced first for
in-place conversion gmem.)

Signed-off-by: Xiaoyao Li
---
 accel/kvm/kvm-all.c    |  3 ++-
 accel/stubs/kvm-stub.c |  1 +
 include/system/kvm.h   |  1 +
 system/physmem.c       | 13 +++++++++++++
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 609537738d38..2d18e961714e 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -106,6 +106,7 @@ static bool kvm_immediate_exit;
 static uint64_t kvm_supported_memory_attributes;
 static bool kvm_guest_memfd_supported;
 bool kvm_guest_memfd_inplace_supported;
+bool kvm_guest_memfd_hugetlb_supported;
 static hwaddr kvm_max_slot_size = ~0;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -2808,6 +2809,7 @@ static int kvm_init(AccelState *as, MachineState *ms)
     kvm_guest_memfd_inplace_supported =
         kvm_check_extension(s, KVM_CAP_GMEM_SHARED_MEM) &&
         kvm_check_extension(s, KVM_CAP_GMEM_CONVERSION);
+    kvm_guest_memfd_hugetlb_supported = kvm_check_extension(s, KVM_CAP_GMEM_HUGETLB);
     kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
 
     if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
@@ -4536,7 +4538,6 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
     fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
     if (fd < 0) {
         error_setg_errno(errp, errno, "Error creating KVM guest_memfd");
-        return -1;
     }
 
     return fd;
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index bf0ccae27b62..fbc1d7c4e9b5 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -25,6 +25,7 @@ bool kvm_allowed;
 bool kvm_readonly_mem_allowed;
 bool kvm_msi_use_devid;
 bool kvm_guest_memfd_inplace_supported;
+bool kvm_guest_memfd_hugetlb_supported;
 
 void kvm_flush_coalesced_mmio_buffer(void)
 {
diff --git a/include/system/kvm.h b/include/system/kvm.h
index 32f2be5f92e1..d1d79510ee26 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -44,6 +44,7 @@ extern bool kvm_readonly_mem_allowed;
 extern bool kvm_msi_use_devid;
 extern bool kvm_pre_fault_memory_supported;
 extern bool kvm_guest_memfd_inplace_supported;
+extern bool kvm_guest_memfd_hugetlb_supported;
 
 #define kvm_enabled() (kvm_allowed)
 /**
diff --git a/system/physmem.c b/system/physmem.c
index 955480685310..ea1c27ea2b99 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1940,8 +1940,21 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
                                             GUEST_MEMFD_FLAG_INIT_PRIVATE;
         }
 
+        if (kvm_guest_memfd_hugetlb_supported) {
+            new_block->guest_memfd_flags |= GUEST_MEMFD_FLAG_HUGETLB |
+                                            GUESTMEM_HUGETLB_FLAG_2MB;
+        }
+
+        new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
+                                                        new_block->guest_memfd_flags, &err);
+        if (new_block->guest_memfd == -ENOMEM) {
+            error_free(err);
+            new_block->guest_memfd_flags &= ~(GUEST_MEMFD_FLAG_HUGETLB |
+                                              GUESTMEM_HUGETLB_FLAG_2MB);
+        }
         new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
                                                         new_block->guest_memfd_flags, errp);
+
         if (new_block->guest_memfd < 0) {
             qemu_mutex_unlock_ramlist();
             goto out_free;
-- 
2.43.0
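The commit message of patch 4/5 describes a try-hugetlb-first policy with a fallback on -ENOMEM. The following is a minimal standalone sketch of that policy, not the code from the patch: the creation call is replaced by a function pointer, and the DEMO_* flag values, the two simulated backends, and the helper name create_gmem_with_fallback() are all hypothetical. In this sketch the second attempt is made only when the first one returned -ENOMEM with the hugetlb flags set.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; the real ones come from the kernel headers. */
#define DEMO_FLAG_HUGETLB     (1ULL << 0)
#define DEMO_HUGETLB_FLAG_2MB (1ULL << 1)

/* Stand-in for the creation call: returns an fd (>= 0) or a negative errno. */
typedef int (*create_fn)(uint64_t size, uint64_t flags);

/* Simulated backend whose hugetlb pool is exhausted. */
static int create_no_hugepages(uint64_t size, uint64_t flags)
{
    (void)size;
    return (flags & DEMO_FLAG_HUGETLB) ? -ENOMEM : 3;
}

/* Simulated backend that can always satisfy the request. */
static int create_with_hugepages(uint64_t size, uint64_t flags)
{
    (void)size;
    (void)flags;
    return 4;
}

/*
 * Request hugetlb when supported; retry without the hugetlb flags only
 * if the first attempt failed with -ENOMEM. On return, *flags holds the
 * flags used by the last attempt.
 */
static int create_gmem_with_fallback(create_fn create, uint64_t size,
                                     uint64_t *flags, int hugetlb_supported)
{
    const uint64_t huge = DEMO_FLAG_HUGETLB | DEMO_HUGETLB_FLAG_2MB;
    int fd;

    if (hugetlb_supported) {
        *flags |= huge;
    }
    fd = create(size, *flags);
    if (fd == -ENOMEM && (*flags & huge)) {
        *flags &= ~huge;
        fd = create(size, *flags);
    }
    return fd;
}
```

Calling create_gmem_with_fallback(create_no_hugepages, 1ULL << 30, &flags, 1) with flags initialised to 0 returns the fd from the second, non-hugetlb attempt and leaves the hugetlb bits cleared in flags, so the caller can tell which backing was obtained.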
From: Xiaoyao Li
To: Paolo Bonzini, David Hildenbrand, ackerleytng@google.com, seanjc@google.com
Cc: Fuad Tabba, Vishal Annapurve, rick.p.edgecombe@intel.com, Kai Huang,
 binbin.wu@linux.intel.com, yan.y.zhao@intel.com, ira.weiny@intel.com,
 michael.roth@amd.com, kvm@vger.kernel.org, qemu-devel@nongnu.org, Peter Xu,
 Philippe Mathieu-Daudé
Subject: [POC PATCH 5/5] [HACK] memory: Don't enable in-place conversion for internal MemoryRegion with gmem
Date: Tue, 15 Jul 2025 11:31:41 +0800
Message-ID: <20250715033141.517457-6-xiaoyao.li@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250715033141.517457-1-xiaoyao.li@intel.com>
References: <20250715033141.517457-1-xiaoyao.li@intel.com>

Currently, TDVF cannot work with gmem in-place conversion because the
current implementation of KVM_TDX_INIT_MEM_REGION in KVM requires the
gmem of TDVF to be valid as both shared and private at the same time.

To work around it, explicitly do not enable in-place conversion for
internal MemoryRegions with gmem, so that TDVF doesn't use in-place
conversion gmem and KVM_TDX_INIT_MEM_REGION initializes the gmem from
the separate shared memory.

To make in-place conversion work with TDX's initial memory, one possible
solution and flow would be as below; it requires a KVM change:

 - QEMU creates gmem as shared;
 - QEMU mmaps the gmem and loads the TDVF binary into it;
 - QEMU converts the gmem to private with the content preserved [1];
 - QEMU invokes KVM_TDX_INIT_MEM_REGION without a valid src, so that KVM
   knows to fetch the content in place and use in-place PAGE.ADD for TDX.

[1] https://lore.kernel.org/all/aG0pNijVpl0czqXu@google.com/

Signed-off-by: Xiaoyao Li
---
 include/system/memory.h | 3 +++
 system/memory.c         | 2 +-
 system/physmem.c        | 8 +++++---
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index f14fbf65805d..89d6449cef70 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -256,6 +256,9 @@ typedef struct IOMMUTLBEvent {
  */
 #define RAM_PRIVATE (1 << 13)
 
+/* Don't enable in-place conversion for the guest memfd backend */
+#define RAM_GUEST_MEMFD_NO_INPLACE (1 << 14)
+
 static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
                                        IOMMUNotifierFlag flags,
                                        hwaddr start, hwaddr end,
diff --git a/system/memory.c b/system/memory.c
index 6870a41629ef..c1b73abc4c94 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -3702,7 +3702,7 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
     DeviceState *owner_dev;
 
     if (!memory_region_init_ram_flags_nomigrate(mr, owner, name, size,
-                                                RAM_GUEST_MEMFD, errp)) {
+                                                RAM_GUEST_MEMFD | RAM_GUEST_MEMFD_NO_INPLACE, errp)) {
         return false;
     }
     /* This will assert if owner is neither NULL nor a DeviceState.
diff --git a/system/physmem.c b/system/physmem.c
index ea1c27ea2b99..c23379082f38 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1916,7 +1916,8 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
 
     if (new_block->flags & RAM_GUEST_MEMFD) {
         int ret;
-        bool in_place = kvm_guest_memfd_inplace_supported;
+        bool in_place = !(new_block->flags & RAM_GUEST_MEMFD_NO_INPLACE) &&
+                        kvm_guest_memfd_inplace_supported;
 
         new_block->guest_memfd_flags = 0;
 
@@ -2230,7 +2231,8 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
     ram_flags &= ~RAM_PRIVATE;
 
     assert((ram_flags & ~(RAM_SHARED | RAM_RESIZEABLE | RAM_PREALLOC |
-                          RAM_NORESERVE | RAM_GUEST_MEMFD)) == 0);
+                          RAM_NORESERVE | RAM_GUEST_MEMFD |
+                          RAM_GUEST_MEMFD_NO_INPLACE)) == 0);
     assert(!host ^ (ram_flags & RAM_PREALLOC));
     assert(max_size >= size);
 
@@ -2314,7 +2316,7 @@ RAMBlock *qemu_ram_alloc(ram_addr_t size, uint32_t ram_flags,
                          MemoryRegion *mr, Error **errp)
 {
     assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD |
-                          RAM_PRIVATE)) == 0);
+                          RAM_PRIVATE | RAM_GUEST_MEMFD_NO_INPLACE)) == 0);
     return qemu_ram_alloc_internal(size, size, NULL, NULL, ram_flags, mr, errp);
 }
 
-- 
2.43.0
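Patch 5/5 gates in-place conversion on a per-block opt-out flag and widens the allow-list asserts so that the new flag passes validation. Below is a small standalone sketch of those two checks, written outside QEMU. The DEMO_* names and the helpers use_in_place() and flags_allowed() are hypothetical; the bit position of DEMO_RAM_GUEST_MEMFD is an assumption, and only the (1 << 14) value for the opt-out flag is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define DEMO_RAM_GUEST_MEMFD            (1u << 12)
/* Same bit as RAM_GUEST_MEMFD_NO_INPLACE in the patch. */
#define DEMO_RAM_GUEST_MEMFD_NO_INPLACE (1u << 14)

/*
 * In-place conversion is used only for guest_memfd-backed blocks that
 * have not opted out, and only when the host reports support for it.
 */
static bool use_in_place(uint32_t ram_flags, bool kvm_inplace_supported)
{
    return (ram_flags & DEMO_RAM_GUEST_MEMFD) &&
           !(ram_flags & DEMO_RAM_GUEST_MEMFD_NO_INPLACE) &&
           kvm_inplace_supported;
}

/* Allow-list check in the style of the asserts: no bit outside the mask may be set. */
static bool flags_allowed(uint32_t ram_flags, uint32_t allowed_mask)
{
    return (ram_flags & ~allowed_mask) == 0;
}
```

With these helpers, a block created with both flags set never takes the in-place path, whatever the host supports, which is the behaviour the workaround relies on for the TDVF region. The allow-list check also shows why each assert in the patch has to be extended: a caller passing the new flag against the old mask would be rejected.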