From nobody Sat Nov 15 17:58:29 2025
From: Chenyi Qiang <chenyi.qiang@intel.com>
To: David Hildenbrand, Alexey Kardashevskiy, Peter Xu, Gupta Pankaj,
    Paolo Bonzini, Philippe Mathieu-Daudé, Michael Roth
Cc: Chenyi Qiang, qemu-devel@nongnu.org, kvm@vger.kernel.org,
    Williams Dan J, Zhao Liu, Baolu Lu, Gao Chao, Xu Yilun, Li Xiaoyao,
    Cédric Le Goater, Alex Williamson
Subject: [PATCH v6 1/5] memory: Export a helper to get intersection of a
 MemoryRegionSection with a given range
Date: Fri, 30 May 2025 16:32:50 +0800
Message-ID: <20250530083256.105186-2-chenyi.qiang@intel.com>
In-Reply-To: <20250530083256.105186-1-chenyi.qiang@intel.com>
References: <20250530083256.105186-1-chenyi.qiang@intel.com>
Content-Type: text/plain; charset="utf-8"

Rename the helper to memory_region_section_intersect_range() to make it
more generic. Meanwhile, define @end as Int128 and replace the related
operations with the int128_*() API, since the helper is exported as a
wider interface.

Suggested-by: Alexey Kardashevskiy
Reviewed-by: Alexey Kardashevskiy
Reviewed-by: David Hildenbrand
Reviewed-by: Zhao Liu
Signed-off-by: Chenyi Qiang
Reviewed-by: Pankaj Gupta
Reviewed-by: Xiaoyao Li
---
Changes in v6:
- No change.
Changes in v5:
- Indent change for int128 ops to avoid lines over 80 characters
- Add two Reviewed-by from Alexey and Zhao
Changes in v4:
- No change.
---
 hw/virtio/virtio-mem.c  | 32 +++++---------------------------
 include/system/memory.h | 30 ++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index a3d1a676e7..b3c126ea1e 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -244,28 +244,6 @@ static int virtio_mem_for_each_plugged_range(VirtIOMEM *vmem, void *arg,
     return ret;
 }
 
-/*
- * Adjust the memory section to cover the intersection with the given range.
- *
- * Returns false if the intersection is empty, otherwise returns true.
- */
-static bool virtio_mem_intersect_memory_section(MemoryRegionSection *s,
-                                                uint64_t offset, uint64_t size)
-{
-    uint64_t start = MAX(s->offset_within_region, offset);
-    uint64_t end = MIN(s->offset_within_region + int128_get64(s->size),
-                       offset + size);
-
-    if (end <= start) {
-        return false;
-    }
-
-    s->offset_within_address_space += start - s->offset_within_region;
-    s->offset_within_region = start;
-    s->size = int128_make64(end - start);
-    return true;
-}
-
 typedef int (*virtio_mem_section_cb)(MemoryRegionSection *s, void *arg);
 
 static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
@@ -287,7 +265,7 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
                                  first_bit + 1) - 1;
         size = (last_bit - first_bit + 1) * vmem->block_size;
 
-        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
             break;
         }
         ret = cb(&tmp, arg);
@@ -319,7 +297,7 @@ static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem,
                                  first_bit + 1) - 1;
         size = (last_bit - first_bit + 1) * vmem->block_size;
 
-        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
             break;
         }
         ret = cb(&tmp, arg);
@@ -355,7 +333,7 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
     QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
         MemoryRegionSection tmp = *rdl->section;
 
-        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
             continue;
         }
         rdl->notify_discard(rdl, &tmp);
@@ -371,7 +349,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
     QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
         MemoryRegionSection tmp = *rdl->section;
 
-        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
             continue;
         }
         ret = rdl->notify_populate(rdl, &tmp);
@@ -388,7 +366,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
             if (rdl2 == rdl) {
                 break;
             }
-            if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
+            if (!memory_region_section_intersect_range(&tmp, offset, size)) {
                 continue;
             }
             rdl2->notify_discard(rdl2, &tmp);
diff --git a/include/system/memory.h b/include/system/memory.h
index fbbf4cf911..b961c4076a 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1211,6 +1211,36 @@ MemoryRegionSection *memory_region_section_new_copy(MemoryRegionSection *s);
  */
 void memory_region_section_free_copy(MemoryRegionSection *s);
 
+/**
+ * memory_region_section_intersect_range: Adjust the memory section to cover
+ * the intersection with the given range.
+ *
+ * @s: the #MemoryRegionSection to be adjusted
+ * @offset: the offset of the given range in the memory region
+ * @size: the size of the given range
+ *
+ * Returns false if the intersection is empty, otherwise returns true.
+ */
+static inline bool memory_region_section_intersect_range(MemoryRegionSection *s,
+                                                         uint64_t offset,
+                                                         uint64_t size)
+{
+    uint64_t start = MAX(s->offset_within_region, offset);
+    Int128 end = int128_min(int128_add(int128_make64(s->offset_within_region),
+                                       s->size),
+                            int128_add(int128_make64(offset),
+                                       int128_make64(size)));
+
+    if (int128_le(end, int128_make64(start))) {
+        return false;
+    }
+
+    s->offset_within_address_space += start - s->offset_within_region;
+    s->offset_within_region = start;
+    s->size = int128_sub(end, int128_make64(start));
+    return true;
+}
+
 /**
  * memory_region_init: Initialize a memory region
  *
-- 
2.43.5

From nobody Sat Nov 15 17:58:29 2025
From: Chenyi Qiang <chenyi.qiang@intel.com>
To: David Hildenbrand, Alexey Kardashevskiy, Peter Xu, Gupta Pankaj,
    Paolo Bonzini, Philippe Mathieu-Daudé, Michael Roth
Cc: Chenyi Qiang, qemu-devel@nongnu.org, kvm@vger.kernel.org,
    Williams Dan J, Zhao Liu, Baolu Lu, Gao Chao, Xu Yilun, Li Xiaoyao,
    Cédric Le Goater, Alex Williamson
Subject: [PATCH v6 2/5] memory: Change memory_region_set_ram_discard_manager()
 to return the result
Date: Fri, 30 May 2025 16:32:51 +0800
Message-ID: <20250530083256.105186-3-chenyi.qiang@intel.com>
In-Reply-To: <20250530083256.105186-1-chenyi.qiang@intel.com>
References: <20250530083256.105186-1-chenyi.qiang@intel.com>
Content-Type: text/plain; charset="utf-8"

Modify memory_region_set_ram_discard_manager() to return -EBUSY if a
RamDiscardManager is already set in the MemoryRegion. The caller must
handle this failure, such as having virtio-mem undo its actions and fail
the realize() process. Opportunistically move the call earlier to avoid
complex error handling.

This change is beneficial when introducing a new RamDiscardManager
instance besides virtio-mem. After
ram_block_coordinated_discard_require(true) unlocks all RamDiscardManager
instances, only one instance is allowed to be set for one MemoryRegion
at present.

Suggested-by: David Hildenbrand
Reviewed-by: David Hildenbrand
Signed-off-by: Chenyi Qiang
Reviewed-by: Pankaj Gupta
Reviewed-by: Xiaoyao Li
---
Changes in v6:
- Add Reviewed-by from David.
Changes in v5:
- Nit in commit message (return false -> -EBUSY)
- Add set_ram_discard_manager(NULL) when ram_block_discard_range() fails.
Changes in v3:
- Move set_ram_discard_manager() up to avoid a g_free()
- Clean up set_ram_discard_manager() definition
---
 hw/virtio/virtio-mem.c  | 30 +++++++++++++++++-------------
 include/system/memory.h |  6 +++---
 system/memory.c         | 10 +++++++---
 3 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index b3c126ea1e..2e491e8c44 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -1047,6 +1047,17 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
         return;
     }
 
+    /*
+     * Set ourselves as RamDiscardManager before the plug handler maps the
+     * memory region and exposes it via an address space.
+     */
+    if (memory_region_set_ram_discard_manager(&vmem->memdev->mr,
+                                              RAM_DISCARD_MANAGER(vmem))) {
+        error_setg(errp, "Failed to set RamDiscardManager");
+        ram_block_coordinated_discard_require(false);
+        return;
+    }
+
     /*
      * We don't know at this point whether shared RAM is migrated using
      * QEMU or migrated using the file content. "x-ignore-shared" will be
@@ -1061,6 +1072,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
         ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
         if (ret) {
             error_setg_errno(errp, -ret, "Unexpected error discarding RAM");
+            memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
             ram_block_coordinated_discard_require(false);
             return;
         }
@@ -1122,13 +1134,6 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     vmem->system_reset = VIRTIO_MEM_SYSTEM_RESET(obj);
     vmem->system_reset->vmem = vmem;
     qemu_register_resettable(obj);
-
-    /*
-     * Set ourselves as RamDiscardManager before the plug handler maps the
-     * memory region and exposes it via an address space.
-     */
-    memory_region_set_ram_discard_manager(&vmem->memdev->mr,
-                                          RAM_DISCARD_MANAGER(vmem));
 }
 
 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -1136,12 +1141,6 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOMEM *vmem = VIRTIO_MEM(dev);
 
-    /*
-     * The unplug handler unmapped the memory region, it cannot be
-     * found via an address space anymore. Unset ourselves.
-     */
-    memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
-
     qemu_unregister_resettable(OBJECT(vmem->system_reset));
     object_unref(OBJECT(vmem->system_reset));
 
@@ -1154,6 +1153,11 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     virtio_del_queue(vdev, 0);
     virtio_cleanup(vdev);
     g_free(vmem->bitmap);
+    /*
+     * The unplug handler unmapped the memory region, it cannot be
+     * found via an address space anymore. Unset ourselves.
+     */
+    memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
     ram_block_coordinated_discard_require(false);
 }
 
diff --git a/include/system/memory.h b/include/system/memory.h
index b961c4076a..896948deb1 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -2499,13 +2499,13 @@ static inline bool memory_region_has_ram_discard_manager(MemoryRegion *mr)
  *
  * This function must not be called for a mapped #MemoryRegion, a #MemoryRegion
  * that does not cover RAM, or a #MemoryRegion that already has a
- * #RamDiscardManager assigned.
+ * #RamDiscardManager assigned. Return 0 if the rdm is set successfully.
  *
  * @mr: the #MemoryRegion
  * @rdm: #RamDiscardManager to set
  */
-void memory_region_set_ram_discard_manager(MemoryRegion *mr,
-                                           RamDiscardManager *rdm);
+int memory_region_set_ram_discard_manager(MemoryRegion *mr,
+                                          RamDiscardManager *rdm);
 
 /**
  * memory_region_find: translate an address/size relative to a
diff --git a/system/memory.c b/system/memory.c
index 63b983efcd..b45b508dce 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2106,12 +2106,16 @@ RamDiscardManager *memory_region_get_ram_discard_manager(MemoryRegion *mr)
     return mr->rdm;
 }
 
-void memory_region_set_ram_discard_manager(MemoryRegion *mr,
-                                           RamDiscardManager *rdm)
+int memory_region_set_ram_discard_manager(MemoryRegion *mr,
+                                          RamDiscardManager *rdm)
 {
     g_assert(memory_region_is_ram(mr));
-    g_assert(!rdm || !mr->rdm);
+    if (mr->rdm && rdm) {
+        return -EBUSY;
+    }
+
     mr->rdm = rdm;
+    return 0;
 }
 
 uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
-- 
2.43.5

From nobody Sat Nov 15 17:58:29 2025
From: Chenyi Qiang <chenyi.qiang@intel.com>
To: David Hildenbrand, Alexey Kardashevskiy, Peter Xu, Gupta Pankaj,
    Paolo Bonzini, Philippe Mathieu-Daudé, Michael Roth
Cc: Chenyi Qiang, qemu-devel@nongnu.org, kvm@vger.kernel.org,
    Williams Dan J, Zhao Liu, Baolu Lu, Gao Chao, Xu Yilun, Li Xiaoyao,
    Cédric Le Goater, Alex Williamson
Subject: [PATCH v6 3/5] memory: Unify the definition of ReplayRamPopulate()
 and ReplayRamDiscard()
Date: Fri, 30 May 2025 16:32:52 +0800
Message-ID: <20250530083256.105186-4-chenyi.qiang@intel.com>
In-Reply-To: <20250530083256.105186-1-chenyi.qiang@intel.com>
References: <20250530083256.105186-1-chenyi.qiang@intel.com>
Content-Type: text/plain; charset="utf-8"

Update the ReplayRamDiscard() function to return the result, and unify
ReplayRamPopulate() and ReplayRamDiscard() into ReplayRamDiscardState(),
since their definitions are now identical. This unification simplifies
related structures, such as VirtIOMEMReplayData, which makes the code
cleaner.

Reviewed-by: David Hildenbrand
Signed-off-by: Chenyi Qiang
Reviewed-by: Pankaj Gupta
Reviewed-by: Xiaoyao Li
---
Changes in v6:
- Add Reviewed-by from David
- Add a documentation comment for the prototype change
Changes in v5:
- Rename ReplayRamStateChange to ReplayRamDiscardState (David)
- return data->fn(s, data->opaque) instead of 0 in
  virtio_mem_rdm_replay_discarded_cb(). (Alexey)
Changes in v4:
- Modify the commit message. We won't use the Replay() operation when
  doing the attribute change like v3.
---
 hw/virtio/virtio-mem.c  | 21 +++++++-------
 include/system/memory.h | 64 ++++++++++++++++++++++++++++++-----------
 migration/ram.c         |  5 ++--
 system/memory.c         | 12 ++++----
 4 files changed, 66 insertions(+), 36 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 2e491e8c44..c46f6f9c3e 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -1732,7 +1732,7 @@ static bool virtio_mem_rdm_is_populated(const RamDiscardManager *rdm,
 }
 
 struct VirtIOMEMReplayData {
-    void *fn;
+    ReplayRamDiscardState fn;
     void *opaque;
 };
 
@@ -1740,12 +1740,12 @@ static int virtio_mem_rdm_replay_populated_cb(MemoryRegionSection *s, void *arg)
 {
     struct VirtIOMEMReplayData *data = arg;
 
-    return ((ReplayRamPopulate)data->fn)(s, data->opaque);
+    return data->fn(s, data->opaque);
 }
 
 static int virtio_mem_rdm_replay_populated(const RamDiscardManager *rdm,
                                            MemoryRegionSection *s,
-                                           ReplayRamPopulate replay_fn,
+                                           ReplayRamDiscardState replay_fn,
                                            void *opaque)
 {
     const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
@@ -1764,14 +1764,13 @@ static int virtio_mem_rdm_replay_discarded_cb(MemoryRegionSection *s,
 {
     struct VirtIOMEMReplayData *data = arg;
 
-    ((ReplayRamDiscard)data->fn)(s, data->opaque);
-    return 0;
+    return data->fn(s, data->opaque);
 }
 
-static void virtio_mem_rdm_replay_discarded(const RamDiscardManager *rdm,
-                                            MemoryRegionSection *s,
-                                            ReplayRamDiscard replay_fn,
-                                            void *opaque)
+static int virtio_mem_rdm_replay_discarded(const RamDiscardManager *rdm,
+                                           MemoryRegionSection *s,
+                                           ReplayRamDiscardState replay_fn,
+                                           void *opaque)
 {
     const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
     struct VirtIOMEMReplayData data = {
@@ -1780,8 +1779,8 @@ static void virtio_mem_rdm_replay_discarded(const RamDiscardManager *rdm,
     };
 
     g_assert(s->mr == &vmem->memdev->mr);
-    virtio_mem_for_each_unplugged_section(vmem, s, &data,
-                                          virtio_mem_rdm_replay_discarded_cb);
+    return virtio_mem_for_each_unplugged_section(vmem, s, &data,
+                                                 virtio_mem_rdm_replay_discarded_cb);
 }
 
 static void virtio_mem_rdm_register_listener(RamDiscardManager *rdm,
diff --git a/include/system/memory.h b/include/system/memory.h
index 896948deb1..4f45a187d6 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -575,8 +575,20 @@ static inline void ram_discard_listener_init(RamDiscardListener *rdl,
     rdl->double_discard_supported = double_discard_supported;
 }
 
-typedef int (*ReplayRamPopulate)(MemoryRegionSection *section, void *opaque);
-typedef void (*ReplayRamDiscard)(MemoryRegionSection *section, void *opaque);
+/**
+ * ReplayRamDiscardState:
+ *
+ * The callback handler for #RamDiscardManagerClass.replay_populated/
+ * #RamDiscardManagerClass.replay_discarded to invoke on populated/discarded
+ * parts.
+ *
+ * @section: the #MemoryRegionSection of populated/discarded part
+ * @opaque: pointer to forward to the callback
+ *
+ * Returns 0 on success, or a negative error if failed.
+ */
+typedef int (*ReplayRamDiscardState)(MemoryRegionSection *section,
+                                     void *opaque);
 
 /*
  * RamDiscardManagerClass:
@@ -650,36 +662,38 @@ struct RamDiscardManagerClass {
     /**
      * @replay_populated:
      *
-     * Call the #ReplayRamPopulate callback for all populated parts within the
-     * #MemoryRegionSection via the #RamDiscardManager.
+     * Call the #ReplayRamDiscardState callback for all populated parts within
+     * the #MemoryRegionSection via the #RamDiscardManager.
      *
      * In case any call fails, no further calls are made.
      *
      * @rdm: the #RamDiscardManager
      * @section: the #MemoryRegionSection
-     * @replay_fn: the #ReplayRamPopulate callback
+     * @replay_fn: the #ReplayRamDiscardState callback
      * @opaque: pointer to forward to the callback
      *
      * Returns 0 on success, or a negative error if any notification failed.
      */
     int (*replay_populated)(const RamDiscardManager *rdm,
                             MemoryRegionSection *section,
-                            ReplayRamPopulate replay_fn, void *opaque);
+                            ReplayRamDiscardState replay_fn, void *opaque);
 
     /**
      * @replay_discarded:
      *
-     * Call the #ReplayRamDiscard callback for all discarded parts within the
-     * #MemoryRegionSection via the #RamDiscardManager.
+     * Call the #ReplayRamDiscardState callback for all discarded parts within
+     * the #MemoryRegionSection via the #RamDiscardManager.
      *
      * @rdm: the #RamDiscardManager
      * @section: the #MemoryRegionSection
-     * @replay_fn: the #ReplayRamDiscard callback
+     * @replay_fn: the #ReplayRamDiscardState callback
      * @opaque: pointer to forward to the callback
+     *
+     * Returns 0 on success, or a negative error if any notification failed.
      */
-    void (*replay_discarded)(const RamDiscardManager *rdm,
-                             MemoryRegionSection *section,
-                             ReplayRamDiscard replay_fn, void *opaque);
+    int (*replay_discarded)(const RamDiscardManager *rdm,
+                            MemoryRegionSection *section,
+                            ReplayRamDiscardState replay_fn, void *opaque);
 
     /**
      * @register_listener:
@@ -720,15 +734,31 @@ uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
 bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
                                       const MemoryRegionSection *section);
 
+/**
+ * ram_discard_manager_replay_populated:
+ *
+ * A wrapper to call the #RamDiscardManagerClass.replay_populated callback
+ * of the #RamDiscardManager.
+ *
+ * Returns 0 on success, or a negative error if any notification failed.
+ */
 int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
                                          MemoryRegionSection *section,
-                                         ReplayRamPopulate replay_fn,
+                                         ReplayRamDiscardState replay_fn,
                                          void *opaque);
 
-void ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
-                                          MemoryRegionSection *section,
-                                          ReplayRamDiscard replay_fn,
-                                          void *opaque);
+/**
+ * ram_discard_manager_replay_discarded:
+ *
+ * A wrapper to call the #RamDiscardManagerClass.replay_discarded callback
+ * of the #RamDiscardManager.
+ *
+ * Returns 0 on success, or a negative error if any notification failed.
+ */
+int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
+                                         MemoryRegionSection *section,
+                                         ReplayRamDiscardState replay_fn,
+                                         void *opaque);
 
 void ram_discard_manager_register_listener(RamDiscardManager *rdm,
                                            RamDiscardListener *rdl,
diff --git a/migration/ram.c b/migration/ram.c
index e12913b43e..c004f37060 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -848,8 +848,8 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
     return ret;
 }
 
-static void dirty_bitmap_clear_section(MemoryRegionSection *section,
-                                       void *opaque)
+static int dirty_bitmap_clear_section(MemoryRegionSection *section,
+                                      void *opaque)
 {
     const hwaddr offset = section->offset_within_region;
     const hwaddr size = int128_get64(section->size);
@@ -868,6 +868,7 @@ static void dirty_bitmap_clear_section(MemoryRegionSection *section,
     }
     *cleared_bits += bitmap_count_one_with_offset(rb->bmap, start, npages);
     bitmap_clear(rb->bmap, start, npages);
+    return 0;
 }
 
 /*
diff --git a/system/memory.c b/system/memory.c
index b45b508dce..de45fbdd3f 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2138,7 +2138,7 @@ bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
 
 int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
                                          MemoryRegionSection *section,
                                          ReplayRamDiscardState replay_fn,
                                          void *opaque)
 {
RamDiscardManagerClass *rdmc =3D RAM_DISCARD_MANAGER_GET_CLASS(rdm); @@ -2147,15 +2147,15 @@ int ram_discard_manager_replay_populated(const RamD= iscardManager *rdm, return rdmc->replay_populated(rdm, section, replay_fn, opaque); } =20 -void ram_discard_manager_replay_discarded(const RamDiscardManager *rdm, - MemoryRegionSection *section, - ReplayRamDiscard replay_fn, - void *opaque) +int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm, + MemoryRegionSection *section, + ReplayRamDiscardState replay_fn, + void *opaque) { RamDiscardManagerClass *rdmc =3D RAM_DISCARD_MANAGER_GET_CLASS(rdm); =20 g_assert(rdmc->replay_discarded); - rdmc->replay_discarded(rdm, section, replay_fn, opaque); + return rdmc->replay_discarded(rdm, section, replay_fn, opaque); } =20 void ram_discard_manager_register_listener(RamDiscardManager *rdm, --=20 2.43.5 From nobody Sat Nov 15 17:58:29 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=intel.com ARC-Seal: i=1; a=rsa-sha256; t=1748594094; cv=none; d=zohomail.com; s=zohoarc; b=OY484QqTKuIgS3kRjvKCC4rE1qDVTIJwePTdA5YmOBeLMtGFt5CJvmen3wunpTgOg0XFtrn2MXnV/a3Tmr0mg/6nq3Guh4O1B1BB6VwTsmmoNLfI8fMydq25BH89N9N95EdpiBybQM/8G7TZdM1TNKljXMlub+M3KMo41lGoDQY= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1748594094; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=8fEzW1Hc8EFkX8ZR+MQdsqoP2ddWHTjgdq8NrpXAUvo=; 
From: Chenyi Qiang
To: David Hildenbrand, Alexey Kardashevskiy, Peter Xu, Gupta Pankaj, Paolo Bonzini, Philippe Mathieu-Daudé, Michael Roth
Cc: Chenyi Qiang, qemu-devel@nongnu.org, kvm@vger.kernel.org, Williams Dan J, Zhao Liu, Baolu Lu, Gao Chao, Xu Yilun, Li Xiaoyao, Cédric Le Goater, Alex Williamson
Subject: [PATCH v6 4/5] ram-block-attributes: Introduce RamBlockAttributes to manage RAMBlock with guest_memfd
Date: Fri, 30 May 2025 16:32:53 +0800
Message-ID: <20250530083256.105186-5-chenyi.qiang@intel.com>
In-Reply-To: <20250530083256.105186-1-chenyi.qiang@intel.com>
References: <20250530083256.105186-1-chenyi.qiang@intel.com>
Commit 852f0048f3 ("RAMBlock: make guest_memfd require uncoordinated
discard") highlighted that subsystems like VFIO may disable RAM block
discard. However, guest_memfd relies on discard operations for page
conversion between private and shared memory, potentially leading to
stale IOMMU mappings when hardware devices are assigned to confidential
VMs via shared memory. To address this and allow shared device
assignment, it is crucial to ensure that the VFIO subsystem refreshes
its IOMMU mappings.

RamDiscardManager is an existing interface (used by virtio-mem) to
adjust VFIO mappings in relation to VM page assignment. Effectively,
page conversion is similar to hot-removing a page in one mode and
adding it back in the other. Therefore, similar actions are required
for page conversion events. Introduce the RamDiscardManager to
guest_memfd to facilitate this process.

Since guest_memfd is not an object, it cannot directly implement the
RamDiscardManager interface. Implementing it in HostMemoryBackend is
not appropriate because guest_memfd is per RAMBlock, and some RAMBlocks
have a memory backend while others do not. Notably, virtual BIOS
RAMBlocks created with memory_region_init_ram_guest_memfd() do not have
a backend.

To manage RAMBlocks with guest_memfd, define a new object named
RamBlockAttributes to implement the RamDiscardManager interface.
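The populated/discarded bookkeeping the patch builds on can be sketched
with a small toy model (this is not QEMU code; BLOCK_SIZE, NBLOCKS, and
the function names are illustrative assumptions, and listener
notification is reduced to a comment):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the per-RAMBlock attribute bitmap: one bit per
 * host-page-sized block, 1 = shared (populated), 0 = private (discarded).
 * BLOCK_SIZE and NBLOCKS are illustrative, not QEMU values. */
#define BLOCK_SIZE 4096ULL
enum { NBLOCKS = 16 };

static uint32_t bitmap; /* bit i covers [i * BLOCK_SIZE, (i + 1) * BLOCK_SIZE) */

static bool range_populated(uint64_t offset, uint64_t size)
{
    for (uint64_t b = offset / BLOCK_SIZE; b < (offset + size) / BLOCK_SIZE; b++) {
        if (!(bitmap & (1u << b))) {
            return false; /* at least one block is still private */
        }
    }
    return true;
}

static bool range_discarded(uint64_t offset, uint64_t size)
{
    for (uint64_t b = offset / BLOCK_SIZE; b < (offset + size) / BLOCK_SIZE; b++) {
        if (bitmap & (1u << b)) {
            return false; /* at least one block is still shared */
        }
    }
    return true;
}

/* A conversion request flips whole blocks. to_discard = true models a
 * shared -> private conversion; in the real code each flipped block
 * would also trigger a notify_discard()/notify_populate() callback so
 * that listeners such as VFIO can unmap/map the corresponding DMA range. */
static void state_change(uint64_t offset, uint64_t size, bool to_discard)
{
    for (uint64_t b = offset / BLOCK_SIZE; b < (offset + size) / BLOCK_SIZE; b++) {
        if (to_discard) {
            bitmap &= ~(1u << b);
        } else {
            bitmap |= 1u << b;
        }
    }
}
```

A mixed range (some blocks shared, some private) is neither "populated"
nor "discarded" as a whole, which is why the real state_change() has to
walk individual blocks in that case.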
This object can store the guest_memfd information, such as the bitmap
for shared memory and the registered listeners for event notification.
In the context of RamDiscardManager, the shared state is treated as
populated and the private state as discarded. To notify about
conversion events, a new state_change() helper is exported for users
to notify the listeners like VFIO, so that VFIO can dynamically DMA
map/unmap the shared mapping.

Note that the memory state is tracked at host page size granularity,
as the minimum conversion size can be one page per request and VFIO
expects the DMA mapping for a specific iova to be mapped and unmapped
with the same granularity. Confidential VMs may perform partial
conversions, such as conversions on small regions within larger ones.
To prevent such invalid cases, and until support for cutting DMA
mappings is available, all operations are performed at 4K granularity.

In addition, memory conversion failures currently cause QEMU to quit
instead of resuming the guest or retrying the operation. Adding more
error handling or rollback mechanisms once conversion failures are
allowed is future work. For example, in-place conversion of guest_memfd
could retry the unmap operation during a shared-to-private conversion.
For now, keep the complex error handling out of the picture, as it is
not required.

Signed-off-by: Chenyi Qiang
Reviewed-by: Pankaj Gupta
---
Changes in v6:
- Change the object type name from RamBlockAttribute to
  RamBlockAttributes. (David)
- Save the associated RAMBlock instead of the MemoryRegion in
  RamBlockAttributes. (David)
- Squash the state_change() helper introduction into this commit, as
  well as the handling of the mixed conversion case. (David)
- Change the block_size type from int to size_t and clean up the
  validation checks. (Alexey)
- Add a tracepoint to track the state changes.
(Alexey)

Changes in v5:
- Revert to using the RamDiscardManager interface instead of
  introducing a new class hierarchy to manage private/shared state, and
  keep the new name RamBlockAttribute compared with the
  MemoryAttributeManager in v3.
- Use the *simple* versions of object_define and object_declare since
  state_change() is changed to an exported function instead of a
  virtual function in a later patch.
- Move the introduction of the RamBlockAttribute field to this patch
  and rename it to ram_shared. (Alexey)
- Call exit() when register/unregister fails. (Zhao)
- Add ram-block-attribute.c to the Memory API section in MAINTAINERS.

Changes in v4:
- Change the name from memory-attribute-manager to ram-block-attribute.
- Implement the newly introduced PrivateSharedManager instead of
  RamDiscardManager and change the related commit message.
- Define the new object in ramblock.h instead of adding a new file.
---
 MAINTAINERS                   |   1 +
 include/system/ramblock.h     |  21 ++
 system/meson.build            |   1 +
 system/ram-block-attributes.c | 480 ++++++++++++++++++++++++++++++++++
 system/trace-events           |   3 +
 5 files changed, 506 insertions(+)
 create mode 100644 system/ram-block-attributes.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 6dacd6d004..8ec39aa7f8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3149,6 +3149,7 @@ F: system/memory.c
 F: system/memory_mapping.c
 F: system/physmem.c
 F: system/memory-internal.h
+F: system/ram-block-attributes.c
 F: scripts/coccinelle/memory-region-housekeeping.cocci
 
 Memory devices
diff --git a/include/system/ramblock.h b/include/system/ramblock.h
index d8a116ba99..1bab9e2dac 100644
--- a/include/system/ramblock.h
+++ b/include/system/ramblock.h
@@ -22,6 +22,10 @@
 #include "exec/cpu-common.h"
 #include "qemu/rcu.h"
 #include "exec/ramlist.h"
+#include "system/hostmem.h"
+
+#define TYPE_RAM_BLOCK_ATTRIBUTES "ram-block-attributes"
+OBJECT_DECLARE_SIMPLE_TYPE(RamBlockAttributes, RAM_BLOCK_ATTRIBUTES)
 
 struct RAMBlock {
     struct rcu_head rcu;
@@ -91,4
+95,21 @@ struct RAMBlock {
     ram_addr_t postcopy_length;
 };
 
+struct RamBlockAttributes {
+    Object parent;
+
+    RAMBlock *ram_block;
+
+    /* 1-setting of the bitmap represents ram is populated (shared) */
+    unsigned bitmap_size;
+    unsigned long *bitmap;
+
+    QLIST_HEAD(, RamDiscardListener) rdl_list;
+};
+
+RamBlockAttributes *ram_block_attributes_create(RAMBlock *ram_block);
+void ram_block_attributes_destroy(RamBlockAttributes *attr);
+int ram_block_attributes_state_change(RamBlockAttributes *attr, uint64_t offset,
+                                      uint64_t size, bool to_discard);
+
 #endif
diff --git a/system/meson.build b/system/meson.build
index c2f0082766..2747dbde80 100644
--- a/system/meson.build
+++ b/system/meson.build
@@ -17,6 +17,7 @@ libsystem_ss.add(files(
   'dma-helpers.c',
   'globals.c',
   'ioport.c',
+  'ram-block-attributes.c',
   'memory_mapping.c',
   'memory.c',
   'physmem.c',
diff --git a/system/ram-block-attributes.c b/system/ram-block-attributes.c
new file mode 100644
index 0000000000..514252413f
--- /dev/null
+++ b/system/ram-block-attributes.c
@@ -0,0 +1,480 @@
+/*
+ * QEMU ram block attributes
+ *
+ * Copyright Intel
+ *
+ * Author:
+ *      Chenyi Qiang
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "system/ramblock.h"
+#include "trace.h"
+
+OBJECT_DEFINE_SIMPLE_TYPE_WITH_INTERFACES(RamBlockAttributes,
+                                          ram_block_attributes,
+                                          RAM_BLOCK_ATTRIBUTES,
+                                          OBJECT,
+                                          { TYPE_RAM_DISCARD_MANAGER },
+                                          { })
+
+static size_t
+ram_block_attributes_get_block_size(const RamBlockAttributes *attr)
+{
+    /*
+     * Because page conversion could be manipulated in the size of at least 4K
+     * or 4K aligned, use the host page size as the granularity to track the
+     * memory attribute.
+     */
+    g_assert(attr && attr->ram_block);
+    g_assert(attr->ram_block->page_size == qemu_real_host_page_size());
+    return attr->ram_block->page_size;
+}
+
+
+static bool
+ram_block_attributes_rdm_is_populated(const RamDiscardManager *rdm,
+                                      const MemoryRegionSection *section)
+{
+    const RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(rdm);
+    const size_t block_size = ram_block_attributes_get_block_size(attr);
+    const uint64_t first_bit = section->offset_within_region / block_size;
+    const uint64_t last_bit = first_bit + int128_get64(section->size) / block_size - 1;
+    unsigned long first_discarded_bit;
+
+    first_discarded_bit = find_next_zero_bit(attr->bitmap, last_bit + 1,
+                                             first_bit);
+    return first_discarded_bit > last_bit;
+}
+
+typedef int (*ram_block_attributes_section_cb)(MemoryRegionSection *s,
+                                               void *arg);
+
+static int
+ram_block_attributes_notify_populate_cb(MemoryRegionSection *section,
+                                        void *arg)
+{
+    RamDiscardListener *rdl = arg;
+
+    return rdl->notify_populate(rdl, section);
+}
+
+static int
+ram_block_attributes_notify_discard_cb(MemoryRegionSection *section,
+                                       void *arg)
+{
+    RamDiscardListener *rdl = arg;
+
+    rdl->notify_discard(rdl, section);
+    return 0;
+}
+
+static int
+ram_block_attributes_for_each_populated_section(const RamBlockAttributes *attr,
+                                                MemoryRegionSection *section,
+                                                void *arg,
+                                                ram_block_attributes_section_cb cb)
+{
+    unsigned long first_bit, last_bit;
+    uint64_t offset, size;
+    const size_t block_size = ram_block_attributes_get_block_size(attr);
+    int ret = 0;
+
+    first_bit = section->offset_within_region / block_size;
+    first_bit = find_next_bit(attr->bitmap, attr->bitmap_size,
+                              first_bit);
+
+    while (first_bit < attr->bitmap_size) {
+        MemoryRegionSection tmp = *section;
+
+        offset = first_bit * block_size;
+        last_bit = find_next_zero_bit(attr->bitmap, attr->bitmap_size,
+                                      first_bit + 1) - 1;
+        size = (last_bit - first_bit + 1) * block_size;
+
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
+            break;
+        }
+
+        ret = cb(&tmp, arg);
+        if (ret) {
+            error_report("%s: Failed to notify RAM discard listener: %s",
+                         __func__, strerror(-ret));
+            break;
+        }
+
+        first_bit = find_next_bit(attr->bitmap, attr->bitmap_size,
+                                  last_bit + 2);
+    }
+
+    return ret;
+}
+
+static int
+ram_block_attributes_for_each_discarded_section(const RamBlockAttributes *attr,
+                                                MemoryRegionSection *section,
+                                                void *arg,
+                                                ram_block_attributes_section_cb cb)
+{
+    unsigned long first_bit, last_bit;
+    uint64_t offset, size;
+    const size_t block_size = ram_block_attributes_get_block_size(attr);
+    int ret = 0;
+
+    first_bit = section->offset_within_region / block_size;
+    first_bit = find_next_zero_bit(attr->bitmap, attr->bitmap_size,
+                                   first_bit);
+
+    while (first_bit < attr->bitmap_size) {
+        MemoryRegionSection tmp = *section;
+
+        offset = first_bit * block_size;
+        last_bit = find_next_bit(attr->bitmap, attr->bitmap_size,
+                                 first_bit + 1) - 1;
+        size = (last_bit - first_bit + 1) * block_size;
+
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
+            break;
+        }
+
+        ret = cb(&tmp, arg);
+        if (ret) {
+            error_report("%s: Failed to notify RAM discard listener: %s",
+                         __func__, strerror(-ret));
+            break;
+        }
+
+        first_bit = find_next_zero_bit(attr->bitmap,
+                                       attr->bitmap_size,
+                                       last_bit + 2);
+    }
+
+    return ret;
+}
+
+static uint64_t
+ram_block_attributes_rdm_get_min_granularity(const RamDiscardManager *rdm,
+                                             const MemoryRegion *mr)
+{
+    const RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(rdm);
+
+    g_assert(mr == attr->ram_block->mr);
+    return ram_block_attributes_get_block_size(attr);
+}
+
+static void
+ram_block_attributes_rdm_register_listener(RamDiscardManager *rdm,
+                                           RamDiscardListener *rdl,
+                                           MemoryRegionSection *section)
+{
+    RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(rdm);
+    int ret;
+
+    g_assert(section->mr == attr->ram_block->mr);
+    rdl->section = memory_region_section_new_copy(section);
+
+    QLIST_INSERT_HEAD(&attr->rdl_list, rdl, next);
+
+    ret = ram_block_attributes_for_each_populated_section(attr, section, rdl,
+                                   ram_block_attributes_notify_populate_cb);
+    if (ret) {
+        error_report("%s: Failed to register RAM discard listener: %s",
+                     __func__, strerror(-ret));
+        exit(1);
+    }
+}
+
+static void
+ram_block_attributes_rdm_unregister_listener(RamDiscardManager *rdm,
+                                             RamDiscardListener *rdl)
+{
+    RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(rdm);
+    int ret;
+
+    g_assert(rdl->section);
+    g_assert(rdl->section->mr == attr->ram_block->mr);
+
+    if (rdl->double_discard_supported) {
+        rdl->notify_discard(rdl, rdl->section);
+    } else {
+        ret = ram_block_attributes_for_each_populated_section(attr,
+                rdl->section, rdl, ram_block_attributes_notify_discard_cb);
+        if (ret) {
+            error_report("%s: Failed to unregister RAM discard listener: %s",
+                         __func__, strerror(-ret));
+            exit(1);
+        }
+    }
+
+    memory_region_section_free_copy(rdl->section);
+    rdl->section = NULL;
+    QLIST_REMOVE(rdl, next);
+}
+
+typedef struct RamBlockAttributesReplayData {
+    ReplayRamDiscardState fn;
+    void *opaque;
+} RamBlockAttributesReplayData;
+
+static int ram_block_attributes_rdm_replay_cb(MemoryRegionSection *section,
+                                              void *arg)
+{
+    RamBlockAttributesReplayData *data = arg;
+
+    return data->fn(section, data->opaque);
+}
+
+static int
+ram_block_attributes_rdm_replay_populated(const RamDiscardManager *rdm,
+                                          MemoryRegionSection *section,
+                                          ReplayRamDiscardState replay_fn,
+                                          void *opaque)
+{
+    RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(rdm);
+    RamBlockAttributesReplayData data = { .fn = replay_fn, .opaque = opaque };
+
+    g_assert(section->mr == attr->ram_block->mr);
+    return ram_block_attributes_for_each_populated_section(attr, section, &data,
+                                        ram_block_attributes_rdm_replay_cb);
+}
+
+static int
+ram_block_attributes_rdm_replay_discarded(const RamDiscardManager *rdm,
+                                          MemoryRegionSection *section,
+                                          ReplayRamDiscardState replay_fn,
+                                          void *opaque)
+{
+    RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(rdm);
+    RamBlockAttributesReplayData data = { .fn = replay_fn, .opaque = opaque };
+
+    g_assert(section->mr == attr->ram_block->mr);
+    return ram_block_attributes_for_each_discarded_section(attr, section, &data,
+                                        ram_block_attributes_rdm_replay_cb);
+}
+
+static bool
+ram_block_attributes_is_valid_range(RamBlockAttributes *attr, uint64_t offset,
+                                    uint64_t size)
+{
+    MemoryRegion *mr = attr->ram_block->mr;
+
+    g_assert(mr);
+
+    uint64_t region_size = memory_region_size(mr);
+    const size_t block_size = ram_block_attributes_get_block_size(attr);
+
+    if (!QEMU_IS_ALIGNED(offset, block_size) ||
+        !QEMU_IS_ALIGNED(size, block_size)) {
+        return false;
+    }
+    if (offset + size <= offset) {
+        return false;
+    }
+    if (offset + size > region_size) {
+        return false;
+    }
+    return true;
+}
+
+static void ram_block_attributes_notify_discard(RamBlockAttributes *attr,
+                                                uint64_t offset,
+                                                uint64_t size)
+{
+    RamDiscardListener *rdl;
+
+    QLIST_FOREACH(rdl, &attr->rdl_list, next) {
+        MemoryRegionSection tmp = *rdl->section;
+
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
+            continue;
+        }
+        rdl->notify_discard(rdl, &tmp);
+    }
+}
+
+static int
+ram_block_attributes_notify_populate(RamBlockAttributes *attr,
+                                     uint64_t offset, uint64_t size)
+{
+    RamDiscardListener *rdl;
+    int ret = 0;
+
+    QLIST_FOREACH(rdl, &attr->rdl_list, next) {
+        MemoryRegionSection tmp = *rdl->section;
+
+        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
+            continue;
+        }
+        ret = rdl->notify_populate(rdl, &tmp);
+        if (ret) {
+            break;
+        }
+    }
+
+    return ret;
+}
+
+static bool ram_block_attributes_is_range_populated(RamBlockAttributes *attr,
+                                                    uint64_t offset,
+                                                    uint64_t size)
+{
+    const size_t block_size = ram_block_attributes_get_block_size(attr);
+    const unsigned long first_bit = offset / block_size;
+    const unsigned long last_bit = first_bit + (size / block_size) - 1;
+    unsigned long found_bit;
+
+    found_bit = find_next_zero_bit(attr->bitmap, last_bit + 1,
+                                   first_bit);
+    return found_bit > last_bit;
+}
+
+static bool
+ram_block_attributes_is_range_discarded(RamBlockAttributes *attr,
+                                        uint64_t offset, uint64_t size)
+{
+    const size_t block_size = ram_block_attributes_get_block_size(attr);
+    const unsigned long first_bit = offset / block_size;
+    const unsigned long last_bit = first_bit + (size / block_size) - 1;
+    unsigned long found_bit;
+
+    found_bit = find_next_bit(attr->bitmap, last_bit + 1, first_bit);
+    return found_bit > last_bit;
+}
+
+int ram_block_attributes_state_change(RamBlockAttributes *attr,
+                                      uint64_t offset, uint64_t size,
+                                      bool to_discard)
+{
+    const size_t block_size = ram_block_attributes_get_block_size(attr);
+    const unsigned long first_bit = offset / block_size;
+    const unsigned long nbits = size / block_size;
+    bool is_range_discarded, is_range_populated;
+    const uint64_t end = offset + size;
+    unsigned long bit;
+    uint64_t cur;
+    int ret = 0;
+
+    if (!ram_block_attributes_is_valid_range(attr, offset, size)) {
+        error_report("%s, invalid range: offset 0x%lx, size 0x%lx",
+                     __func__, offset, size);
+        return -EINVAL;
+    }
+
+    is_range_discarded = ram_block_attributes_is_range_discarded(attr, offset,
+                                                                 size);
+    is_range_populated = ram_block_attributes_is_range_populated(attr, offset,
+                                                                 size);
+
+    trace_ram_block_attributes_state_change(offset, size,
+                                            is_range_discarded ? "discarded" :
+                                            is_range_populated ? "populated" :
+                                            "mixture",
+                                            to_discard ? "discarded" :
+                                            "populated");
+    if (to_discard) {
+        if (is_range_discarded) {
+            /* Already private */
+        } else if (is_range_populated) {
+            /* Completely shared */
+            bitmap_clear(attr->bitmap, first_bit, nbits);
+            ram_block_attributes_notify_discard(attr, offset, size);
+        } else {
+            /* Unexpected mixture: process individual blocks */
+            for (cur = offset; cur < end; cur += block_size) {
+                bit = cur / block_size;
+                if (!test_bit(bit, attr->bitmap)) {
+                    continue;
+                }
+                clear_bit(bit, attr->bitmap);
+                ram_block_attributes_notify_discard(attr, cur, block_size);
+            }
+        }
+    } else {
+        if (is_range_populated) {
+            /* Already shared */
+        } else if (is_range_discarded) {
+            /* Completely private */
+            bitmap_set(attr->bitmap, first_bit, nbits);
+            ret = ram_block_attributes_notify_populate(attr, offset, size);
+        } else {
+            /* Unexpected mixture: process individual blocks */
+            for (cur = offset; cur < end; cur += block_size) {
+                bit = cur / block_size;
+                if (test_bit(bit, attr->bitmap)) {
+                    continue;
+                }
+                set_bit(bit, attr->bitmap);
+                ret = ram_block_attributes_notify_populate(attr, cur,
+                                                           block_size);
+                if (ret) {
+                    break;
+                }
+            }
+        }
+    }
+
+    return ret;
+}
+
+RamBlockAttributes *ram_block_attributes_create(RAMBlock *ram_block)
+{
+    uint64_t bitmap_size;
+    const int block_size = qemu_real_host_page_size();
+    RamBlockAttributes *attr;
+    int ret;
+    MemoryRegion *mr = ram_block->mr;
+
+    attr = RAM_BLOCK_ATTRIBUTES(object_new(TYPE_RAM_BLOCK_ATTRIBUTES));
+
+    attr->ram_block = ram_block;
+    ret = memory_region_set_ram_discard_manager(mr, RAM_DISCARD_MANAGER(attr));
+    if (ret) {
+        object_unref(OBJECT(attr));
+        return NULL;
+    }
+    bitmap_size = ROUND_UP(mr->size, block_size) / block_size;
+    attr->bitmap_size = bitmap_size;
+    attr->bitmap = bitmap_new(bitmap_size);
+
+    return attr;
+}
+
+void ram_block_attributes_destroy(RamBlockAttributes *attr)
+{
+    if (!attr) {
+        return;
+    }
+
+    g_free(attr->bitmap);
+    memory_region_set_ram_discard_manager(attr->ram_block->mr, NULL);
+    object_unref(OBJECT(attr));
+}
+
+static void ram_block_attributes_init(Object *obj)
+{
+    RamBlockAttributes *attr = RAM_BLOCK_ATTRIBUTES(obj);
+
+    QLIST_INIT(&attr->rdl_list);
+}
+
+static void ram_block_attributes_finalize(Object *obj)
+{
+}
+
+static void ram_block_attributes_class_init(ObjectClass *klass,
+                                            const void *data)
+{
+    RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_CLASS(klass);
+
+    rdmc->get_min_granularity = ram_block_attributes_rdm_get_min_granularity;
+    rdmc->register_listener = ram_block_attributes_rdm_register_listener;
+    rdmc->unregister_listener = ram_block_attributes_rdm_unregister_listener;
+    rdmc->is_populated = ram_block_attributes_rdm_is_populated;
+    rdmc->replay_populated = ram_block_attributes_rdm_replay_populated;
+    rdmc->replay_discarded = ram_block_attributes_rdm_replay_discarded;
+}
diff --git a/system/trace-events b/system/trace-events
index be12ebfb41..82856e44f2 100644
--- a/system/trace-events
+++ b/system/trace-events
@@ -52,3 +52,6 @@ dirtylimit_state_finalize(void)
 dirtylimit_throttle_pct(int cpu_index, uint64_t pct, int64_t time_us) "CPU[%d] throttle percent: %" PRIu64 ", throttle adjust time %"PRIi64 " us"
 dirtylimit_set_vcpu(int cpu_index, uint64_t quota) "CPU[%d] set dirty page rate limit %"PRIu64
 dirtylimit_vcpu_execute(int cpu_index, int64_t sleep_time_us) "CPU[%d] sleep %"PRIi64 " us"
+
+# ram-block-attributes.c
+ram_block_attributes_state_change(uint64_t offset, uint64_t size, const char *from, const char *to) "offset 0x%"PRIx64" size 0x%"PRIx64" from '%s' to '%s'"
-- 
2.43.5
From: Chenyi Qiang
To: David Hildenbrand, Alexey Kardashevskiy, Peter Xu, Gupta Pankaj, Paolo Bonzini, Philippe Mathieu-Daudé, Michael Roth
Cc: Chenyi Qiang, qemu-devel@nongnu.org, kvm@vger.kernel.org, Williams Dan J, Zhao Liu, Baolu Lu, Gao Chao, Xu Yilun, Li Xiaoyao, Cédric Le Goater, Alex Williamson
Subject: [PATCH v6 5/5] physmem: Support coordinated discarding of RAM with guest_memfd
Date: Fri, 30 May 2025 16:32:54 +0800
Message-ID: <20250530083256.105186-6-chenyi.qiang@intel.com>
In-Reply-To: <20250530083256.105186-1-chenyi.qiang@intel.com>
References: <20250530083256.105186-1-chenyi.qiang@intel.com>
A new field, attributes, is introduced in RAMBlock to link to a
RamBlockAttributes object, which centralizes all guest_memfd-related
information (such as the fd and the shared bitmap) within a RAMBlock.
Create and initialize the RamBlockAttributes object in ram_block_add(),
and register the object in the target RAMBlock's MemoryRegion. After
that, a guest_memfd-backed RAMBlock is associated with the
RamDiscardManager interface, and users can perform
RamDiscardManager-specific handling. For example, VFIO will register a
RamDiscardListener and receive notifications when the state_change()
helper is invoked.

As coordinated discard of RAM with guest_memfd is now supported, block
only uncoordinated discard.

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
---
Changes in v6:
- Squash the unblocking of coordinated discard into this commit.
- Remove the checks in the migration path.

Changes in v5:
- Revert to using the RamDiscardManager interface.
- Move the object_new() into the ram_block_attribute_create() helper.
- Add some checks in the migration path.

Changes in v4:
- Remove the replay operations for attribute changes, which will be
  handled in a listener in following patches.
- Add a comment in the error path of realize() to remind future
  development of the unified error path.

Changes in v3:
- Use ram_discard_manager_reply_populated/discarded() to set the memory
  attribute and add undo support if state_change() failed.
- Didn't add Reviewed-by from Alexey due to the new changes in this
  commit.
---
 accel/kvm/kvm-all.c       |  9 +++++++++
 include/system/ramblock.h |  1 +
 system/physmem.c          | 18 ++++++++++++++++--
 3 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 51526d301b..3b390bbb09 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -3089,6 +3089,15 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
     addr = memory_region_get_ram_ptr(mr) + section.offset_within_region;
     rb = qemu_ram_block_from_host(addr, false, &offset);
 
+    ret = ram_block_attributes_state_change(RAM_BLOCK_ATTRIBUTES(mr->rdm),
+                                            offset, size, to_private);
+    if (ret) {
+        error_report("Failed to notify the listener the state change of "
+                     "(0x%"HWADDR_PRIx" + 0x%"HWADDR_PRIx") to %s",
+                     start, size, to_private ? "private" : "shared");
+        goto out_unref;
+    }
+
     if (to_private) {
         if (rb->page_size != qemu_real_host_page_size()) {
             /*
diff --git a/include/system/ramblock.h b/include/system/ramblock.h
index 1bab9e2dac..87e847e184 100644
--- a/include/system/ramblock.h
+++ b/include/system/ramblock.h
@@ -46,6 +46,7 @@ struct RAMBlock {
     int fd;
     uint64_t fd_offset;
     int guest_memfd;
+    RamBlockAttributes *attributes;
     size_t page_size;
     /* dirty bitmap used during migration */
     unsigned long *bmap;
diff --git a/system/physmem.c b/system/physmem.c
index a8a9ca309e..1f1217fa0a 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1916,7 +1916,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
     }
     assert(new_block->guest_memfd < 0);
 
-    ret = ram_block_discard_require(true);
+    ret = ram_block_coordinated_discard_require(true);
     if (ret < 0) {
         error_setg_errno(errp, -ret,
                          "cannot set up private guest memory: discard currently blocked");
@@ -1931,6 +1931,19 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
         goto out_free;
     }
 
+    new_block->attributes = ram_block_attributes_create(new_block);
+    if (!new_block->attributes) {
+        error_setg(errp, "Failed to create ram block attribute");
+        /*
+         * The error path could be unified if the rest of ram_block_add()
+         * ever develops a need to check for errors.
+         */
+        close(new_block->guest_memfd);
+        ram_block_coordinated_discard_require(false);
+        qemu_mutex_unlock_ramlist();
+        goto out_free;
+    }
+
     /*
      * Add a specific guest_memfd blocker if a generic one would not be
      * added by ram_block_add_cpr_blocker.
@@ -2287,8 +2300,9 @@ static void reclaim_ramblock(RAMBlock *block)
     }
 
     if (block->guest_memfd >= 0) {
+        ram_block_attributes_destroy(block->attributes);
        close(block->guest_memfd);
-        ram_block_discard_require(false);
+        ram_block_coordinated_discard_require(false);
    }
 
     g_free(block);
-- 
2.43.5