From: marcandre.lureau@redhat.com
To: qemu-devel@nongnu.org
Cc: Ben Chaney, "Michael S. Tsirkin", Cédric Le Goater, Paolo Bonzini,
    Alex Williamson, Fabiano Rosas, David Hildenbrand,
    Philippe Mathieu-Daudé, Peter Xu, kvm@vger.kernel.org, Mark Kanda,
    Marc-André Lureau
Subject: [PATCH v3 12/15] system/memory: implement RamDiscardManager multi-source aggregation
Date: Thu, 26 Feb 2026 14:59:57 +0100
Message-ID: <20260226140001.3622334-13-marcandre.lureau@redhat.com>
In-Reply-To: <20260226140001.3622334-1-marcandre.lureau@redhat.com>
References: <20260226140001.3622334-1-marcandre.lureau@redhat.com>

From: Marc-André Lureau

Refactor RamDiscardManager to aggregate multiple RamDiscardSource
instances.
This enables scenarios where multiple components (e.g., virtio-mem and
RamBlockAttributes) can coordinate memory discard state for the same
memory region.

The aggregation uses:
- Populated: ALL sources populated
- Discarded: ANY source discarded

When a source is added with existing listeners, they are notified about
regions that become discarded. When a source is removed, listeners are
notified about regions that become populated.

Signed-off-by: Marc-André Lureau
---
 include/system/ram-discard-manager.h | 143 +++++++--
 hw/virtio/virtio-mem.c               |   8 +-
 system/memory.c                      |  15 +-
 system/ram-block-attributes.c        |   6 +-
 system/ram-discard-manager.c         | 427 ++++++++++++++++++++++++---
 5 files changed, 515 insertions(+), 84 deletions(-)

diff --git a/include/system/ram-discard-manager.h b/include/system/ram-discard-manager.h
index b5dbcb4a82d..9d650ee4d7b 100644
--- a/include/system/ram-discard-manager.h
+++ b/include/system/ram-discard-manager.h
@@ -170,30 +170,96 @@ struct RamDiscardSourceClass {
  * becoming discarded in a different granularity than it was populated and the
  * other way around.
  */
+
+typedef struct RamDiscardSourceEntry RamDiscardSourceEntry;
+
+struct RamDiscardSourceEntry {
+    RamDiscardSource *rds;
+    QLIST_ENTRY(RamDiscardSourceEntry) next;
+};
+
 struct RamDiscardManager {
     Object parent;
 
-    RamDiscardSource *rds;
-    MemoryRegion *mr;
+    struct MemoryRegion *mr;
+    QLIST_HEAD(, RamDiscardSourceEntry) source_list;
+    uint64_t min_granularity;
     QLIST_HEAD(, RamDiscardListener) rdl_list;
 };
 
-RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr,
-                                           RamDiscardSource *rds);
+RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr);
+
+/**
+ * ram_discard_manager_add_source:
+ *
+ * Register a #RamDiscardSource with the #RamDiscardManager. The manager
+ * aggregates state from all registered sources using AND semantics: a region
+ * is considered populated only if ALL sources report it as populated.
+ *
+ * If listeners are already registered, they will be notified about any
+ * regions that become discarded due to adding this source. Specifically,
+ * for each region that the new source reports as discarded, if all other
+ * sources reported it as populated, listeners receive a discard notification.
+ *
+ * If any listener rejects the notification (returns an error), previously
+ * notified listeners are rolled back with populate notifications and the
+ * source is not added.
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource to add
+ *
+ * Returns: 0 on success, -EBUSY if @source is already registered, or a
+ * negative error code if a listener rejected the state change.
+ */
+int ram_discard_manager_add_source(RamDiscardManager *rdm,
+                                   RamDiscardSource *source);
+
+/**
+ * ram_discard_manager_del_source:
+ *
+ * Unregister a #RamDiscardSource from the #RamDiscardManager.
+ *
+ * If listeners are already registered, they will be notified about any
+ * regions that become populated due to removing this source. Specifically,
+ * for each region that the removed source reported as discarded, if all
+ * remaining sources report it as populated, listeners receive a populate
+ * notification.
+ *
+ * If any listener rejects the notification (returns an error), previously
+ * notified listeners are rolled back with discard notifications and the
+ * source is not removed.
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource to remove
+ *
+ * Returns: 0 on success, -ENOENT if @source is not registered, or a
+ * negative error code if a listener rejected the state change.
+ */
+int ram_discard_manager_del_source(RamDiscardManager *rdm,
+                                   RamDiscardSource *source);
+
 
 uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
                                                  const MemoryRegion *mr);
 
+/**
+ * ram_discard_manager_is_populated:
+ *
+ * Check if the given memory region section is populated.
+ * If the manager has no sources, it is considered populated.
+ *
+ * @rdm: the #RamDiscardManager
+ * @section: the #MemoryRegionSection to check
+ *
+ * Returns: true if the section is populated, false otherwise.
+ */
 bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
                                       const MemoryRegionSection *section);
 
 /**
  * ram_discard_manager_replay_populated:
  *
- * Iterate the given #MemoryRegionSection at minimum granularity, calling
- * #RamDiscardSourceClass.is_populated for each chunk, and invoke @replay_fn
- * for each contiguous populated range. In case any call fails, no further
- * calls are made.
+ * Call @replay_fn on regions that are populated in all sources.
  *
  * @rdm: the #RamDiscardManager
  * @section: the #MemoryRegionSection
@@ -210,10 +276,7 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
 /**
  * ram_discard_manager_replay_discarded:
  *
- * Iterate the given #MemoryRegionSection at minimum granularity, calling
- * #RamDiscardSourceClass.is_populated for each chunk, and invoke @replay_fn
- * for each contiguous discarded range. In case any call fails, no further
- * calls are made.
+ * Call @replay_fn on regions that are discarded in any source.
  *
  * @rdm: the #RamDiscardManager
  * @section: the #MemoryRegionSection
@@ -234,31 +297,61 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
 void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
                                              RamDiscardListener *rdl);
 
-/*
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+/**
+ * ram_discard_manager_notify_populate:
+ *
+ * Notify listeners that a region is about to be populated by a source.
+ * For multi-source aggregation, only notifies when all sources agree
+ * the region is populated (intersection).
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource that is populating
+ * @offset: offset within the memory region
+ * @size: size of the region being populated
+ *
+ * Returns 0 on success, or a negative error if any listener rejects.
  */
 int ram_discard_manager_notify_populate(RamDiscardManager *rdm,
+                                        RamDiscardSource *source,
                                         uint64_t offset, uint64_t size);
 
-/*
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+/**
+ * ram_discard_manager_notify_discard:
+ *
+ * Notify listeners that a region has been discarded by a source.
+ * For multi-source aggregation, always notifies immediately
+ * (union semantics - any source discarding makes region discarded).
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource that is discarding
+ * @offset: offset within the memory region
+ * @size: size of the region being discarded
  */
 void ram_discard_manager_notify_discard(RamDiscardManager *rdm,
+                                        RamDiscardSource *source,
                                         uint64_t offset, uint64_t size);
 
-/*
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+/**
+ * ram_discard_manager_notify_discard_all:
+ *
+ * Notify listeners that all regions have been discarded by a source.
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource that is discarding
  */
-void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm);
+void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm,
+                                            RamDiscardSource *source);
 
-/*
+/**
+ * ram_discard_manager_replay_populated_to_listeners:
+ *
  * Replay populated sections to all registered listeners.
+ * For multi-source aggregation, only replays regions where all sources
+ * are populated (intersection).
  *
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+ * @rdm: the #RamDiscardManager
+ *
+ * Returns 0 on success, or a negative error if any notification failed.
  */
 int ram_discard_manager_replay_populated_to_listeners(RamDiscardManager *rdm);
 
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 2b67b2882d2..35e03ed7599 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -264,7 +264,8 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
 {
     RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
 
-    ram_discard_manager_notify_discard(rdm, offset, size);
+    ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(vmem),
+                                       offset, size);
 }
 
 static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
@@ -272,7 +273,8 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
 {
     RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
 
-    return ram_discard_manager_notify_populate(rdm, offset, size);
+    return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(vmem),
+                                               offset, size);
 }
 
 static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
@@ -283,7 +285,7 @@ static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
         return;
     }
 
-    ram_discard_manager_notify_discard_all(rdm);
+    ram_discard_manager_notify_discard_all(rdm, RAM_DISCARD_SOURCE(vmem));
 }
 
 static bool virtio_mem_is_range_plugged(const VirtIOMEM *vmem,
diff --git a/system/memory.c b/system/memory.c
index 8b46cb87838..8a4cb7b59ac 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2109,21 +2109,22 @@ int memory_region_add_ram_discard_source(MemoryRegion *mr,
                                          RamDiscardSource *source)
 {
     g_assert(memory_region_is_ram(mr));
-    if (mr->rdm) {
-        return -EBUSY;
+
+    if (!mr->rdm) {
+        mr->rdm = ram_discard_manager_new(mr);
     }
 
-    mr->rdm = ram_discard_manager_new(mr, RAM_DISCARD_SOURCE(source));
-    return 0;
+    return ram_discard_manager_add_source(mr->rdm, source);
 }
 
 void
 memory_region_del_ram_discard_source(MemoryRegion *mr,
                                      RamDiscardSource *source)
 {
-    g_assert(mr->rdm->rds == source);
+    g_assert(mr->rdm);
+
+    ram_discard_manager_del_source(mr->rdm, source);
 
-    object_unref(mr->rdm);
-    mr->rdm = NULL;
+    /* if there is no source and no listener left, we could free rdm */
 }
 
 /* Called with rcu_read_lock held. */
diff --git a/system/ram-block-attributes.c b/system/ram-block-attributes.c
index 718c7075cec..59ec7a28eb0 100644
--- a/system/ram-block-attributes.c
+++ b/system/ram-block-attributes.c
@@ -90,7 +90,8 @@ ram_block_attributes_notify_discard(RamBlockAttributes *attr,
 {
     RamDiscardManager *rdm = memory_region_get_ram_discard_manager(attr->ram_block->mr);
 
-    ram_discard_manager_notify_discard(rdm, offset, size);
+    ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(attr),
+                                       offset, size);
 }
 
 static int
@@ -99,7 +100,8 @@ ram_block_attributes_notify_populate(RamBlockAttributes *attr,
 {
     RamDiscardManager *rdm = memory_region_get_ram_discard_manager(attr->ram_block->mr);
 
-    return ram_discard_manager_notify_populate(rdm, offset, size);
+    return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(attr),
+                                               offset, size);
 }
 
 int ram_block_attributes_state_change(RamBlockAttributes *attr,
diff --git a/system/ram-discard-manager.c b/system/ram-discard-manager.c
index 25beb052a1e..5592bfd3486 100644
--- a/system/ram-discard-manager.c
+++ b/system/ram-discard-manager.c
@@ -7,6 +7,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/error-report.h"
+#include "qemu/queue.h"
 #include "system/memory.h"
 
 static uint64_t ram_discard_source_get_min_granularity(const RamDiscardSource *rds,
@@ -28,20 +29,21 @@ static bool ram_discard_source_is_populated(const RamDiscardSource *rds,
 }
 
 /*
- * Iterate the section at source granularity, aggregating consecutive chunks
- * with matching populated state, and call replay_fn for each run.
+ * Iterate a single source's populated or discarded regions and call
+ * replay_fn for each contiguous run.
  */
-static int replay_by_populated_state(const RamDiscardManager *rdm,
-                                     const MemoryRegionSection *section,
-                                     bool replay_populated,
-                                     ReplayRamDiscardState replay_fn,
-                                     void *opaque)
+static int replay_source_by_state(const RamDiscardSource *source,
+                                  const MemoryRegion *mr,
+                                  const MemoryRegionSection *section,
+                                  bool replay_populated,
+                                  ReplayRamDiscardState replay_fn,
+                                  void *opaque)
 {
     uint64_t granularity, offset, size, end, pos, run_start;
     bool in_run = false;
     int ret = 0;
 
-    granularity = ram_discard_source_get_min_granularity(rdm->rds, rdm->mr);
+    granularity = ram_discard_source_get_min_granularity(source, mr);
     offset = section->offset_within_region;
     size = int128_get64(section->size);
     end = offset + size;
@@ -55,7 +57,7 @@ static int replay_by_populated_state(const RamDiscardManager *rdm,
             .offset_within_region = pos,
             .size = int128_make64(granularity),
         };
-        bool populated = ram_discard_source_is_populated(rdm->rds, &chunk);
+        bool populated = ram_discard_source_is_populated(source, &chunk);
 
         if (populated == replay_populated) {
             if (!in_run) {
@@ -88,28 +90,338 @@ static int replay_by_populated_state(const RamDiscardManager *rdm,
     return ret;
 }
 
-RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr,
-                                           RamDiscardSource *rds)
+RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr)
 {
     RamDiscardManager *rdm;
 
     rdm = RAM_DISCARD_MANAGER(object_new(TYPE_RAM_DISCARD_MANAGER));
-    rdm->rds = rds;
     rdm->mr = mr;
-    QLIST_INIT(&rdm->rdl_list);
     return rdm;
 }
 
+static void ram_discard_manager_update_granularity(RamDiscardManager *rdm)
+{
+    RamDiscardSourceEntry *entry;
+    uint64_t granularity = 0;
+
+    QLIST_FOREACH(entry, &rdm->source_list, next) {
+        uint64_t src_granularity;
+
+        src_granularity =
+            ram_discard_source_get_min_granularity(entry->rds, rdm->mr);
+        g_assert(src_granularity != 0);
+        if (granularity == 0) {
+            granularity = src_granularity;
+        } else {
+            granularity = MIN(granularity, src_granularity);
+        }
+    }
+    rdm->min_granularity = granularity;
+}
+
+static RamDiscardSourceEntry *
+ram_discard_manager_find_source(RamDiscardManager *rdm, RamDiscardSource *rds)
+{
+    RamDiscardSourceEntry *entry;
+
+    QLIST_FOREACH(entry, &rdm->source_list, next) {
+        if (entry->rds == rds) {
+            return entry;
+        }
+    }
+    return NULL;
+}
+
+static int rdl_populate_cb(const MemoryRegionSection *section, void *opaque)
+{
+    RamDiscardListener *rdl = opaque;
+    MemoryRegionSection tmp = *rdl->section;
+
+    g_assert(section->mr == rdl->section->mr);
+
+    if (!memory_region_section_intersect_range(&tmp,
+                                               section->offset_within_region,
+                                               int128_get64(section->size))) {
+        return 0;
+    }
+
+    return rdl->notify_populate(rdl, &tmp);
+}
+
+static int rdl_discard_cb(const MemoryRegionSection *section, void *opaque)
+{
+    RamDiscardListener *rdl = opaque;
+    MemoryRegionSection tmp = *rdl->section;
+
+    g_assert(section->mr == rdl->section->mr);
+
+    if (!memory_region_section_intersect_range(&tmp,
+                                               section->offset_within_region,
+                                               int128_get64(section->size))) {
+        return 0;
+    }
+
+    rdl->notify_discard(rdl, &tmp);
+    return 0;
+}
+
+static bool rdm_is_all_populated_skip(const RamDiscardManager *rdm,
+                                      const MemoryRegionSection *section,
+                                      const RamDiscardSource *skip_source)
+{
+    RamDiscardSourceEntry *entry;
+
+    QLIST_FOREACH(entry, &rdm->source_list, next) {
+        if (skip_source && entry->rds == skip_source) {
+            continue;
+        }
+        if (!ram_discard_source_is_populated(entry->rds, section)) {
+            return false;
+        }
+    }
+    return true;
+}
+
+typedef struct SourceNotifyCtx {
+    RamDiscardManager *rdm;
+    RamDiscardListener *rdl;
+    RamDiscardSource *source; /* added or removed */
+} SourceNotifyCtx;
+
+/*
+ * Unified helper to replay regions based on populated state.
+ * If replay_populated is true: replay regions where ALL sources are populated.
+ * If replay_populated is false: replay regions where ANY source is discarded.
+ */
+static int replay_by_populated_state(const RamDiscardManager *rdm,
+                                     const MemoryRegionSection *section,
+                                     const RamDiscardSource *skip_source,
+                                     bool replay_populated,
+                                     ReplayRamDiscardState replay_fn,
+                                     void *user_opaque)
+{
+    uint64_t granularity = rdm->min_granularity;
+    uint64_t offset, end_offset;
+    uint64_t run_start = 0;
+    bool in_run = false;
+    int ret = 0;
+
+    if (QLIST_EMPTY(&rdm->source_list)) {
+        if (replay_populated) {
+            return replay_fn(section, user_opaque);
+        }
+        return 0;
+    }
+
+    g_assert(granularity != 0);
+
+    offset = section->offset_within_region;
+    end_offset = offset + int128_get64(section->size);
+
+    while (offset < end_offset) {
+        MemoryRegionSection subsection = {
+            .mr = section->mr,
+            .offset_within_region = offset,
+            .size = int128_make64(MIN(granularity, end_offset - offset)),
+        };
+        bool all_populated;
+        bool included;
+
+        all_populated = rdm_is_all_populated_skip(rdm, &subsection,
+                                                  skip_source);
+        included = replay_populated ?
+                   all_populated : !all_populated;
+
+        if (included) {
+            if (!in_run) {
+                run_start = offset;
+                in_run = true;
+            }
+        } else {
+            if (in_run) {
+                MemoryRegionSection run_section = {
+                    .mr = section->mr,
+                    .offset_within_region = run_start,
+                    .size = int128_make64(offset - run_start),
+                };
+                ret = replay_fn(&run_section, user_opaque);
+                if (ret) {
+                    return ret;
+                }
+                in_run = false;
+            }
+        }
+        if (granularity > end_offset - offset) {
+            break;
+        }
+        offset += granularity;
+    }
+
+    if (in_run) {
+        MemoryRegionSection run_section = {
+            .mr = section->mr,
+            .offset_within_region = run_start,
+            .size = int128_make64(end_offset - run_start),
+        };
+        ret = replay_fn(&run_section, user_opaque);
+    }
+
+    return ret;
+}
+
+static int add_source_check_discard_cb(const MemoryRegionSection *section,
+                                       void *opaque)
+{
+    SourceNotifyCtx *ctx = opaque;
+
+    return replay_by_populated_state(ctx->rdm, section, ctx->source, true,
+                                     rdl_discard_cb, ctx->rdl);
+}
+
+static int del_source_check_populate_cb(const MemoryRegionSection *section,
+                                        void *opaque)
+{
+    SourceNotifyCtx *ctx = opaque;
+
+    return replay_by_populated_state(ctx->rdm, section, ctx->source, true,
+                                     rdl_populate_cb, ctx->rdl);
+}
+
+int ram_discard_manager_add_source(RamDiscardManager *rdm,
+                                   RamDiscardSource *source)
+{
+    RamDiscardSourceEntry *entry;
+    RamDiscardListener *rdl, *rdl2;
+    int ret = 0;
+
+    if (ram_discard_manager_find_source(rdm, source)) {
+        return -EBUSY;
+    }
+
+    /*
+     * If there are existing listeners, notify them about regions that
+     * become discarded due to adding this source. Only notify for regions
+     * that were previously populated (all other sources agreed).
+     */
+    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
+        SourceNotifyCtx ctx = {
+            .rdm = rdm,
+            .rdl = rdl,
+            /* no need to set source */
+        };
+        ret = replay_source_by_state(source, rdm->mr, rdl->section,
+                                     false,
+                                     add_source_check_discard_cb, &ctx);
+        if (ret) {
+            break;
+        }
+    }
+    if (ret) {
+        QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
+            SourceNotifyCtx ctx = {
+                .rdm = rdm,
+                .rdl = rdl2,
+            };
+            replay_source_by_state(source, rdm->mr, rdl2->section,
+                                   false,
+                                   del_source_check_populate_cb,
+                                   &ctx);
+            if (rdl == rdl2) {
+                break;
+            }
+        }
+
+        return ret;
+    }
+
+    entry = g_new0(RamDiscardSourceEntry, 1);
+    entry->rds = source;
+    QLIST_INSERT_HEAD(&rdm->source_list, entry, next);
+
+    ram_discard_manager_update_granularity(rdm);
+
+    return ret;
+}
+
+int ram_discard_manager_del_source(RamDiscardManager *rdm,
+                                   RamDiscardSource *source)
+{
+    RamDiscardSourceEntry *entry;
+    RamDiscardListener *rdl, *rdl2;
+    int ret = 0;
+
+    entry = ram_discard_manager_find_source(rdm, source);
+    if (!entry) {
+        return -ENOENT;
+    }
+
+    /*
+     * If there are existing listeners, check if any regions become
+     * populated due to removing this source.
+     */
+    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
+        SourceNotifyCtx ctx = {
+            .rdm = rdm,
+            .rdl = rdl,
+            .source = source,
+        };
+        /*
+         * From the previously discarded regions, check if any
+         * regions become populated.
+         */
+        ret = replay_source_by_state(source, rdm->mr, rdl->section,
+                                     false,
+                                     del_source_check_populate_cb,
+                                     &ctx);
+        if (ret) {
+            break;
+        }
+    }
+    if (ret) {
+        QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
+            SourceNotifyCtx ctx = {
+                .rdm = rdm,
+                .rdl = rdl2,
+                .source = source,
+            };
+            replay_source_by_state(source, rdm->mr, rdl2->section,
+                                   false,
+                                   add_source_check_discard_cb,
+                                   &ctx);
+            if (rdl == rdl2) {
+                break;
+            }
+        }
+
+        return ret;
+    }
+
+    QLIST_REMOVE(entry, next);
+    g_free(entry);
+    ram_discard_manager_update_granularity(rdm);
+    return ret;
+}
+
 uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
                                                  const MemoryRegion *mr)
 {
-    return ram_discard_source_get_min_granularity(rdm->rds, mr);
+    g_assert(mr == rdm->mr);
+    return rdm->min_granularity;
 }
 
+/*
+ * Aggregated query: returns true only if ALL sources report populated (AND).
+ */
 bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
                                       const MemoryRegionSection *section)
 {
-    return ram_discard_source_is_populated(rdm->rds, section);
+    RamDiscardSourceEntry *entry;
+
+    QLIST_FOREACH(entry, &rdm->source_list, next) {
+        if (!ram_discard_source_is_populated(entry->rds, section)) {
+            return false;
+        }
+    }
+    return true;
 }
 
 int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
@@ -117,7 +429,8 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
                                          ReplayRamDiscardState replay_fn,
                                          void *opaque)
 {
-    return replay_by_populated_state(rdm, section, true, replay_fn, opaque);
+    return replay_by_populated_state(rdm, section, NULL, true,
+                                     replay_fn, opaque);
 }
 
 int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
@@ -125,14 +438,17 @@ int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
                                          ReplayRamDiscardState replay_fn,
                                          void *opaque)
 {
-    return replay_by_populated_state(rdm, section, false, replay_fn, opaque);
+    return replay_by_populated_state(rdm, section, NULL, false,
+                                     replay_fn, opaque);
 }
 
 static void ram_discard_manager_initfn(Object *obj)
 {
     RamDiscardManager *rdm = RAM_DISCARD_MANAGER(obj);
 
+    QLIST_INIT(&rdm->source_list);
     QLIST_INIT(&rdm->rdl_list);
+    rdm->min_granularity = 0;
 }
 
 static void ram_discard_manager_finalize(Object *obj)
@@ -140,74 +456,91 @@ static void ram_discard_manager_finalize(Object *obj)
     RamDiscardManager *rdm = RAM_DISCARD_MANAGER(obj);
 
     g_assert(QLIST_EMPTY(&rdm->rdl_list));
+    g_assert(QLIST_EMPTY(&rdm->source_list));
 }
 
 int ram_discard_manager_notify_populate(RamDiscardManager *rdm,
+                                        RamDiscardSource *source,
                                         uint64_t offset, uint64_t size)
 {
     RamDiscardListener *rdl, *rdl2;
+    MemoryRegionSection section = {
+        .mr = rdm->mr,
+        .offset_within_region = offset,
+        .size = int128_make64(size),
+    };
     int ret = 0;
 
-    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
-        MemoryRegionSection tmp = *rdl->section;
+    g_assert(ram_discard_manager_find_source(rdm, source));
 
-        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
-            continue;
-        }
-        ret = rdl->notify_populate(rdl, &tmp);
+    /*
+     * Only notify about regions that are populated in ALL sources.
+     * replay_by_populated_state checks all sources including the one that
+     * just populated.
+     */
+    QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
+        ret = replay_by_populated_state(rdm, &section, NULL, true,
+                                        rdl_populate_cb, rdl);
         if (ret) {
            break;
        }
    }
 
     if (ret) {
-        /* Notify all already-notified listeners about discard. */
+        /*
+         * Rollback: notify discard for listeners we already notified,
+         * including the failing listener which may have been partially
+         * notified. Listeners must handle discard notifications for
+         * regions they didn't receive populate notifications for.
+         */
         QLIST_FOREACH(rdl2, &rdm->rdl_list, next) {
-            MemoryRegionSection tmp = *rdl2->section;
-
+            replay_by_populated_state(rdm, &section, NULL, true,
+                                      rdl_discard_cb, rdl2);
             if (rdl2 == rdl) {
                 break;
             }
-            if (!memory_region_section_intersect_range(&tmp, offset, size)) {
-                continue;
-            }
-            rdl2->notify_discard(rdl2, &tmp);
         }
     }
     return ret;
 }
 
 void ram_discard_manager_notify_discard(RamDiscardManager *rdm,
+                                        RamDiscardSource *source,
                                         uint64_t offset, uint64_t size)
 {
     RamDiscardListener *rdl;
-
+    MemoryRegionSection section = {
+        .mr = rdm->mr,
+        .offset_within_region = offset,
+        .size = int128_make64(size),
+    };
+
+    g_assert(ram_discard_manager_find_source(rdm, source));
+
+    /*
+     * Only notify about ranges that were aggregately populated before this
+     * source's discard. Since the source has already updated its state,
+     * we use replay_by_populated_state with this source skipped - it will
+     * replay only the ranges where all OTHER sources are populated.
+     */
     QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
-        MemoryRegionSection tmp = *rdl->section;
-
-        if (!memory_region_section_intersect_range(&tmp, offset, size)) {
-            continue;
-        }
-        rdl->notify_discard(rdl, &tmp);
+        replay_by_populated_state(rdm, &section, source, true,
+                                  rdl_discard_cb, rdl);
     }
 }
 
-void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm)
+void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm,
+                                            RamDiscardSource *source)
 {
     RamDiscardListener *rdl;
 
+    g_assert(ram_discard_manager_find_source(rdm, source));
+
     QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
         rdl->notify_discard(rdl, rdl->section);
     }
 }
 
-static int rdm_populate_cb(const MemoryRegionSection *section, void *opaque)
-{
-    RamDiscardListener *rdl = opaque;
-
-    return rdl->notify_populate(rdl, section);
-}
-
 void ram_discard_manager_register_listener(RamDiscardManager *rdm,
                                            RamDiscardListener *rdl,
                                            MemoryRegionSection *section)
@@ -220,7 +553,7 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
     QLIST_INSERT_HEAD(&rdm->rdl_list, rdl, next);
 
     ret = ram_discard_manager_replay_populated(rdm, rdl->section,
-                                               rdm_populate_cb, rdl);
+                                               rdl_populate_cb, rdl);
     if (ret) {
         error_report("%s: Replaying populated ranges failed: %s",
                      __func__, strerror(-ret));
@@ -246,7 +579,7 @@ int ram_discard_manager_replay_populated_to_listeners(RamDiscardManager *rdm)
 
     QLIST_FOREACH(rdl, &rdm->rdl_list, next) {
         ret = ram_discard_manager_replay_populated(rdm, rdl->section,
-                                                   rdm_populate_cb, rdl);
+                                                   rdl_populate_cb, rdl);
         if (ret) {
             break;
         }
-- 
2.53.0