From: marcandre.lureau@redhat.com
To: qemu-devel@nongnu.org
Cc: Cédric Le Goater, kvm@vger.kernel.org, Alex Williamson, Paolo Bonzini,
    Ben Chaney, "Michael S. Tsirkin", David Hildenbrand, Fabiano Rosas,
    Peter Xu, Philippe Mathieu-Daudé, Mark Kanda, Marc-André Lureau
Subject: [PATCH 09/10] memory: implement RamDiscardManager multi-source aggregation
Date: Wed, 4 Feb 2026 14:07:05 +0400
Message-ID: <20260204100708.724800-10-marcandre.lureau@redhat.com>
In-Reply-To: <20260204100708.724800-1-marcandre.lureau@redhat.com>
References: <20260204100708.724800-1-marcandre.lureau@redhat.com>

From: Marc-André Lureau

Refactor RamDiscardManager so that it aggregates multiple RamDiscardSource
instances. This lets several components (e.g. virtio-mem and
RamBlockAttributes) coordinate memory discard state for the same memory
region.

The aggregation rules are:
- Populated: a range is populated only if ALL sources report it populated
- Discarded: a range is discarded if ANY source reports it discarded

When a source is added while listeners are registered, the listeners are
notified about regions that become discarded as a result. When a source is
removed, listeners are notified about regions that become populated.

Signed-off-by: Marc-André Lureau
---
 include/system/ram-discard-manager.h | 139 +++++++--
 hw/virtio/virtio-mem.c               |   8 +-
 system/memory.c                      |  15 +-
 system/ram-block-attributes.c        |   6 +-
 system/ram-discard-manager.c         | 434 ++++++++++++++++++++++++---
 5 files changed, 527 insertions(+), 75 deletions(-)

diff --git a/include/system/ram-discard-manager.h b/include/system/ram-discard-manager.h
index b188e09a30f..1c5a6f55833 100644
--- a/include/system/ram-discard-manager.h
+++ b/include/system/ram-discard-manager.h
@@ -206,28 +206,96 @@ struct RamDiscardSourceClass {
  * becoming discarded in a different granularity than it was populated and the
  * other way around.
  */
+
+typedef struct RamDiscardSourceEntry RamDiscardSourceEntry;
+
+struct RamDiscardSourceEntry {
+    RamDiscardSource *rds;
+    QLIST_ENTRY(RamDiscardSourceEntry) next;
+};
+
 struct RamDiscardManager {
     Object parent;
 
-    RamDiscardSource *rds;
-    MemoryRegion *mr;
+    struct MemoryRegion *mr;
+    QLIST_HEAD(, RamDiscardSourceEntry) source_list;
+    uint64_t min_granularity;
     QLIST_HEAD(, RamDiscardListener) rdl_list;
 };
 
-RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr,
-                                           RamDiscardSource *rds);
+RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr);
+
+/**
+ * ram_discard_manager_add_source:
+ *
+ * Register a #RamDiscardSource with the #RamDiscardManager. The manager
+ * aggregates state from all registered sources using AND semantics: a region
+ * is considered populated only if ALL sources report it as populated.
+ *
+ * If listeners are already registered, they will be notified about any
+ * regions that become discarded due to adding this source. Specifically,
+ * for each region that the new source reports as discarded, if all other
+ * sources reported it as populated, listeners receive a discard notification.
+ *
+ * If any listener rejects the notification (returns an error), previously
+ * notified listeners are rolled back with populate notifications and the
+ * source is not added.
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource to add
+ *
+ * Returns: 0 on success, -EBUSY if @source is already registered, or a
+ * negative error code if a listener rejected the state change.
+ */
+int ram_discard_manager_add_source(RamDiscardManager *rdm,
+                                   RamDiscardSource *source);
+
+/**
+ * ram_discard_manager_del_source:
+ *
+ * Unregister a #RamDiscardSource from the #RamDiscardManager.
+ *
+ * If listeners are already registered, they will be notified about any
+ * regions that become populated due to removing this source. Specifically,
+ * for each region that the removed source reported as discarded, if all
+ * remaining sources report it as populated, listeners receive a populate
+ * notification.
+ *
+ * If any listener rejects the notification (returns an error), previously
+ * notified listeners are rolled back with discard notifications and the
+ * source is not removed.
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource to remove
+ *
+ * Returns: 0 on success, -ENOENT if @source is not registered, or a
+ * negative error code if a listener rejected the state change.
+ */
+int ram_discard_manager_del_source(RamDiscardManager *rdm,
+                                   RamDiscardSource *source);
+
 
 uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *rdm,
                                                  const MemoryRegion *mr);
 
+/**
+ * ram_discard_manager_is_populated:
+ *
+ * Check if the given memory region section is populated.
+ * If the manager has no sources, it is considered populated.
+ *
+ * @rdm: the #RamDiscardManager
+ * @section: the #MemoryRegionSection to check
+ *
+ * Returns: true if the section is populated, false otherwise.
+ */
 bool ram_discard_manager_is_populated(const RamDiscardManager *rdm,
                                       const MemoryRegionSection *section);
 
 /**
  * ram_discard_manager_replay_populated:
  *
- * A wrapper to call the #RamDiscardSourceClass.replay_populated callback
- * of the #RamDiscardSource sources.
+ * Call @replay_fn on regions that are populated in all sources.
  *
  * @rdm: the #RamDiscardManager
  * @section: the #MemoryRegionSection
@@ -244,8 +312,7 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
 /**
  * ram_discard_manager_replay_discarded:
  *
- * A wrapper to call the #RamDiscardSourceClass.replay_discarded callback
- * of the #RamDiscardSource sources.
+ * Call @replay_fn on regions that are discarded in any source.
  *
  * @rdm: the #RamDiscardManager
  * @section: the #MemoryRegionSection
@@ -266,31 +333,61 @@ void ram_discard_manager_register_listener(RamDiscardManager *rdm,
 void ram_discard_manager_unregister_listener(RamDiscardManager *rdm,
                                              RamDiscardListener *rdl);
 
-/*
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+/**
+ * ram_discard_manager_notify_populate:
+ *
+ * Notify listeners that a region is about to be populated by a source.
+ * For multi-source aggregation, only notifies when all sources agree
+ * the region is populated (intersection).
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource that is populating
+ * @offset: offset within the memory region
+ * @size: size of the region being populated
+ *
+ * Returns 0 on success, or a negative error if any listener rejects.
  */
 int ram_discard_manager_notify_populate(RamDiscardManager *rdm,
+                                        RamDiscardSource *source,
                                         uint64_t offset, uint64_t size);
 
-/*
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+/**
+ * ram_discard_manager_notify_discard:
+ *
+ * Notify listeners that a region has been discarded by a source.
+ * For multi-source aggregation, always notifies immediately
+ * (union semantics - any source discarding makes the region discarded).
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource that is discarding
+ * @offset: offset within the memory region
+ * @size: size of the region being discarded
  */
 void ram_discard_manager_notify_discard(RamDiscardManager *rdm,
+                                        RamDiscardSource *source,
                                         uint64_t offset, uint64_t size);
 
-/*
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+/**
+ * ram_discard_manager_notify_discard_all:
+ *
+ * Notify listeners that all regions have been discarded by a source.
+ *
+ * @rdm: the #RamDiscardManager
+ * @source: the #RamDiscardSource that is discarding
  */
-void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm);
+void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm,
+                                            RamDiscardSource *source);
 
-/*
+/**
+ * ram_discard_manager_replay_populated_to_listeners:
+ *
  * Replay populated sections to all registered listeners.
+ * For multi-source aggregation, only replays regions where all sources
+ * are populated (intersection).
  *
- * Note: later refactoring should take the source into account and the manager
- * should be able to aggregate multiple sources.
+ * @rdm: the #RamDiscardManager
+ *
+ * Returns 0 on success, or a negative error if any notification failed.
  */
 int ram_discard_manager_replay_populated_to_listeners(RamDiscardManager *rdm);
 
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index ec165503205..185d04d498c 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -330,7 +330,8 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
 {
     RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
 
-    ram_discard_manager_notify_discard(rdm, offset, size);
+    ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(vmem),
+                                       offset, size);
 }
 
 static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
@@ -338,7 +339,8 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
 {
     RamDiscardManager *rdm = memory_region_get_ram_discard_manager(&vmem->memdev->mr);
 
-    return ram_discard_manager_notify_populate(rdm, offset, size);
+    return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(vmem),
+                                               offset, size);
 }
 
 static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
@@ -349,7 +351,7 @@ static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
         return;
     }
 
-    ram_discard_manager_notify_discard_all(rdm);
+    ram_discard_manager_notify_discard_all(rdm, RAM_DISCARD_SOURCE(vmem));
 }
 
 static bool virtio_mem_is_range_plugged(const VirtIOMEM *vmem,
diff --git a/system/memory.c b/system/memory.c
index 8b46cb87838..8a4cb7b59ac 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -2109,21 +2109,22 @@ int memory_region_add_ram_discard_source(MemoryRegion *mr,
                                          RamDiscardSource *source)
 {
     g_assert(memory_region_is_ram(mr));
-    if (mr->rdm) {
-        return -EBUSY;
+
+    if (!mr->rdm) {
+        mr->rdm = ram_discard_manager_new(mr);
     }
 
-    mr->rdm = ram_discard_manager_new(mr, RAM_DISCARD_SOURCE(source));
-    return 0;
+    return ram_discard_manager_add_source(mr->rdm, source);
 }
 
 void memory_region_del_ram_discard_source(MemoryRegion *mr,
                                           RamDiscardSource *source)
 {
-    g_assert(mr->rdm->rds == source);
+    g_assert(mr->rdm);
+
+    ram_discard_manager_del_source(mr->rdm, source);
 
-    object_unref(mr->rdm);
-    mr->rdm = NULL;
+    /* if there is no source and no listener left, we could free rdm */
 }
 
 /* Called with rcu_read_lock held.
*/ diff --git a/system/ram-block-attributes.c b/system/ram-block-attributes.c index e921e09f5b3..e7db0e72e77 100644 --- a/system/ram-block-attributes.c +++ b/system/ram-block-attributes.c @@ -218,7 +218,8 @@ ram_block_attributes_notify_discard(RamBlockAttributes = *attr, { RamDiscardManager *rdm =3D memory_region_get_ram_discard_manager(attr-= >ram_block->mr); =20 - ram_discard_manager_notify_discard(rdm, offset, size); + ram_discard_manager_notify_discard(rdm, RAM_DISCARD_SOURCE(attr), + offset, size); } =20 static int @@ -227,7 +228,8 @@ ram_block_attributes_notify_populate(RamBlockAttributes= *attr, { RamDiscardManager *rdm =3D memory_region_get_ram_discard_manager(attr-= >ram_block->mr); =20 - return ram_discard_manager_notify_populate(rdm, offset, size); + return ram_discard_manager_notify_populate(rdm, RAM_DISCARD_SOURCE(att= r), + offset, size); } =20 int ram_block_attributes_state_change(RamBlockAttributes *attr, diff --git a/system/ram-discard-manager.c b/system/ram-discard-manager.c index 1c9ff7fda58..c48ac4c66d6 100644 --- a/system/ram-discard-manager.c +++ b/system/ram-discard-manager.c @@ -7,6 +7,7 @@ =20 #include "qemu/osdep.h" #include "qemu/error-report.h" +#include "qemu/queue.h" #include "system/memory.h" =20 static uint64_t ram_discard_source_get_min_granularity(const RamDiscardSou= rce *rds, @@ -49,28 +50,341 @@ static int ram_discard_source_replay_discarded(const R= amDiscardSource *rds, return rdsc->replay_discarded(rds, section, replay_fn, opaque); } =20 -RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr, - RamDiscardSource *rds) +RamDiscardManager *ram_discard_manager_new(MemoryRegion *mr) { RamDiscardManager *rdm; =20 rdm =3D RAM_DISCARD_MANAGER(object_new(TYPE_RAM_DISCARD_MANAGER)); - rdm->rds =3D rds; rdm->mr =3D mr; - QLIST_INIT(&rdm->rdl_list); return rdm; } =20 +static void ram_discard_manager_update_granularity(RamDiscardManager *rdm) +{ + RamDiscardSourceEntry *entry; + uint64_t granularity =3D 0; + + QLIST_FOREACH(entry, &rdm->source_list, next) { + uint64_t src_granularity; + + src_granularity =3D ram_discard_source_get_min_granularity(entry->= rds, rdm->mr); + g_assert(src_granularity !=3D 0); + if (granularity =3D=3D 0) { + granularity =3D src_granularity; + } else { + granularity =3D MIN(granularity, src_granularity); + } + } + rdm->min_granularity =3D granularity; +} + +static RamDiscardSourceEntry * +ram_discard_manager_find_source(RamDiscardManager *rdm, RamDiscardSource *= rds) +{ + RamDiscardSourceEntry *entry; + + QLIST_FOREACH(entry, &rdm->source_list, next) { + if (entry->rds =3D=3D rds) { + return entry; + } + } + return NULL; +} + +static int rdl_populate_cb(const MemoryRegionSection *section, void *opaqu= e) +{ + RamDiscardListener *rdl =3D opaque; + MemoryRegionSection tmp =3D *rdl->section; + + g_assert(section->mr =3D=3D rdl->section->mr); + + if (!memory_region_section_intersect_range(&tmp, + section->offset_within_regi= on, + int128_get64(section->size)= )) { + return 0; + } + + return rdl->notify_populate(rdl, &tmp); +} + +static int rdl_discard_cb(const MemoryRegionSection *section, void *opaque) +{ + RamDiscardListener *rdl =3D opaque; + MemoryRegionSection tmp =3D *rdl->section; + + g_assert(section->mr =3D=3D rdl->section->mr); + + if (!memory_region_section_intersect_range(&tmp, + section->offset_within_regi= on, + int128_get64(section->size)= )) { + return 0; + } + + rdl->notify_discard(rdl, &tmp); + return 0; +} + +static bool rdm_is_all_populated_skip(const RamDiscardManager *rdm, + const MemoryRegionSection *section, 
+ const RamDiscardSource *skip_source) +{ + RamDiscardSourceEntry *entry; + + QLIST_FOREACH(entry, &rdm->source_list, next) { + if (skip_source && entry->rds =3D=3D skip_source) { + continue; + } + if (!ram_discard_source_is_populated(entry->rds, section)) { + return false; + } + } + return true; +} + +typedef struct SourceNotifyCtx { + RamDiscardManager *rdm; + RamDiscardListener *rdl; + RamDiscardSource *source; /* added or removed */ +} SourceNotifyCtx; + +/* + * Unified helper to replay regions based on populated state. + * If replay_populated is true: replay regions where ALL sources are popul= ated. + * If replay_populated is false: replay regions where ANY source is discar= ded. + */ +static int replay_by_populated_state(const RamDiscardManager *rdm, + const MemoryRegionSection *section, + const RamDiscardSource *skip_source, + bool replay_populated, + ReplayRamDiscardState replay_fn, + void *user_opaque) +{ + uint64_t granularity =3D rdm->min_granularity; + uint64_t offset, end_offset; + uint64_t run_start =3D 0; + bool in_run =3D false; + int ret =3D 0; + + if (QLIST_EMPTY(&rdm->source_list)) { + if (replay_populated) { + return replay_fn(section, user_opaque); + } + return 0; + } + + g_assert(granularity !=3D 0); + + offset =3D section->offset_within_region; + end_offset =3D offset + int128_get64(section->size); + + while (offset < end_offset) { + MemoryRegionSection subsection =3D { + .mr =3D section->mr, + .offset_within_region =3D offset, + .size =3D int128_make64(MIN(granularity, end_offset - offset)), + }; + bool all_populated; + bool included; + + all_populated =3D rdm_is_all_populated_skip(rdm, &subsection, skip= _source); + included =3D replay_populated ? all_populated : !all_populated; + + if (included) { + if (!in_run) { + run_start =3D offset; + in_run =3D true; + } + } else { + if (in_run) { + MemoryRegionSection run_section =3D { + .mr =3D section->mr, + .offset_within_region =3D run_start, + .size =3D int128_make64(offset - run_start), + }; + ret =3D replay_fn(&run_section, user_opaque); + if (ret) { + return ret; + } + in_run =3D false; + } + } + if (granularity > end_offset - offset) { + break; + } + offset +=3D granularity; + } + + if (in_run) { + MemoryRegionSection run_section =3D { + .mr =3D section->mr, + .offset_within_region =3D run_start, + .size =3D int128_make64(end_offset - run_start), + }; + ret =3D replay_fn(&run_section, user_opaque); + } + + return ret; +} + +static int add_source_check_discard_cb(const MemoryRegionSection *section, + void *opaque) +{ + SourceNotifyCtx *ctx =3D opaque; + + return replay_by_populated_state(ctx->rdm, section, ctx->source, true, + rdl_discard_cb, ctx->rdl); +} + +static int del_source_check_populate_cb(const MemoryRegionSection *section, + void *opaque) +{ + SourceNotifyCtx *ctx =3D opaque; + + return replay_by_populated_state(ctx->rdm, section, ctx->source, true, + rdl_populate_cb, ctx->rdl); +} + +int ram_discard_manager_add_source(RamDiscardManager *rdm, + RamDiscardSource *source) +{ + RamDiscardSourceEntry *entry; + RamDiscardListener *rdl, *rdl2; + int ret =3D 0; + + if (ram_discard_manager_find_source(rdm, source)) { + return -EBUSY; + } + + /* + * If there are existing listeners, notify them about regions that + * become discarded due to adding this source. Only notify for regions + * that were previously populated (all other sources agreed). 
+ */ + QLIST_FOREACH(rdl, &rdm->rdl_list, next) { + SourceNotifyCtx ctx =3D { + .rdm =3D rdm, + .rdl =3D rdl, + /* no need to set source */ + }; + ret =3D ram_discard_source_replay_discarded(source, rdl->section, + add_source_check_discard= _cb, &ctx); + if (ret) { + break; + } + } + if (ret) { + QLIST_FOREACH(rdl2, &rdm->rdl_list, next) { + SourceNotifyCtx ctx =3D { + .rdm =3D rdm, + .rdl =3D rdl2, + }; + ram_discard_source_replay_discarded(source, rdl2->section, + del_source_check_populate_= cb, &ctx); + if (rdl =3D=3D rdl2) { + break; + } + } + + return ret; + } + + entry =3D g_new0(RamDiscardSourceEntry, 1); + entry->rds =3D source; + QLIST_INSERT_HEAD(&rdm->source_list, entry, next); + + ram_discard_manager_update_granularity(rdm); + + return ret; +} + +int ram_discard_manager_del_source(RamDiscardManager *rdm, + RamDiscardSource *source) +{ + RamDiscardSourceEntry *entry; + RamDiscardListener *rdl, *rdl2; + int ret =3D 0; + + entry =3D ram_discard_manager_find_source(rdm, source); + if (!entry) { + return -ENOENT; + } + + /* + * If there are existing listeners, check if any regions become + * populated due to removing this source. + */ + QLIST_FOREACH(rdl, &rdm->rdl_list, next) { + SourceNotifyCtx ctx =3D { + .rdm =3D rdm, + .rdl =3D rdl, + .source =3D source, + }; + /* from the previously discarded regions, check if any regions bec= ome populated */ + ret =3D ram_discard_source_replay_discarded(source, rdl->section, + del_source_check_populat= e_cb, &ctx); + if (ret) { + break; + } + } + if (ret) { + QLIST_FOREACH(rdl2, &rdm->rdl_list, next) { + SourceNotifyCtx ctx =3D { + .rdm =3D rdm, + .rdl =3D rdl2, + .source =3D source, + }; + ram_discard_source_replay_discarded(source, rdl2->section, + add_source_check_discard_c= b, &ctx); + if (rdl =3D=3D rdl2) { + break; + } + } + + return ret; + } + + QLIST_REMOVE(entry, next); + g_free(entry); + ram_discard_manager_update_granularity(rdm); + return ret; +} + uint64_t ram_discard_manager_get_min_granularity(const RamDiscardManager *= rdm, const MemoryRegion *mr) { - return ram_discard_source_get_min_granularity(rdm->rds, mr); + g_assert(mr =3D=3D rdm->mr); + return rdm->min_granularity; } =20 +/* + * Aggregated query: returns true only if ALL sources report populated (AN= D). 
+ */ bool ram_discard_manager_is_populated(const RamDiscardManager *rdm, const MemoryRegionSection *section) { - return ram_discard_source_is_populated(rdm->rds, section); + RamDiscardSourceEntry *entry; + + QLIST_FOREACH(entry, &rdm->source_list, next) { + if (!ram_discard_source_is_populated(entry->rds, section)) { + return false; + } + } + return true; +} + +typedef struct ReplayCtx { + const RamDiscardManager *rdm; + ReplayRamDiscardState replay_fn; + void *user_opaque; +} ReplayCtx; + +static int aggregated_replay_populated_cb(const MemoryRegionSection *secti= on, + void *opaque) +{ + ReplayCtx *ctx =3D opaque; + + return replay_by_populated_state(ctx->rdm, section, NULL, true, + ctx->replay_fn, ctx->user_opaque); } =20 int ram_discard_manager_replay_populated(const RamDiscardManager *rdm, @@ -78,8 +392,21 @@ int ram_discard_manager_replay_populated(const RamDisca= rdManager *rdm, ReplayRamDiscardState replay_fn, void *opaque) { - return ram_discard_source_replay_populated(rdm->rds, section, - replay_fn, opaque); + RamDiscardSourceEntry *first; + ReplayCtx ctx; + + first =3D QLIST_FIRST(&rdm->source_list); + if (!first) { + return replay_fn(section, opaque); + } + + ctx.rdm =3D rdm; + ctx.replay_fn =3D replay_fn; + ctx.user_opaque =3D opaque; + + return ram_discard_source_replay_populated(first->rds, section, + aggregated_replay_populated= _cb, + &ctx); } =20 int ram_discard_manager_replay_discarded(const RamDiscardManager *rdm, @@ -87,15 +414,21 @@ int ram_discard_manager_replay_discarded(const RamDisc= ardManager *rdm, ReplayRamDiscardState replay_fn, void *opaque) { - return ram_discard_source_replay_discarded(rdm->rds, section, - replay_fn, opaque); + /* No sources means nothing is discarded (all is considered populated)= */ + if (QLIST_EMPTY(&rdm->source_list)) { + return 0; + } + + return replay_by_populated_state(rdm, section, NULL, false, replay_fn,= opaque); } =20 static void ram_discard_manager_initfn(Object *obj) { RamDiscardManager *rdm =3D RAM_DISCARD_MANAGER(obj); =20 + QLIST_INIT(&rdm->source_list); QLIST_INIT(&rdm->rdl_list); + rdm->min_granularity =3D 0; } =20 static void ram_discard_manager_finalize(Object *obj) @@ -103,74 +436,91 @@ static void ram_discard_manager_finalize(Object *obj) RamDiscardManager *rdm =3D RAM_DISCARD_MANAGER(obj); =20 g_assert(QLIST_EMPTY(&rdm->rdl_list)); + g_assert(QLIST_EMPTY(&rdm->source_list)); } =20 int ram_discard_manager_notify_populate(RamDiscardManager *rdm, + RamDiscardSource *source, uint64_t offset, uint64_t size) { RamDiscardListener *rdl, *rdl2; + MemoryRegionSection section =3D { + .mr =3D rdm->mr, + .offset_within_region =3D offset, + .size =3D int128_make64(size), + }; int ret =3D 0; =20 - QLIST_FOREACH(rdl, &rdm->rdl_list, next) { - MemoryRegionSection tmp =3D *rdl->section; + g_assert(ram_discard_manager_find_source(rdm, source)); =20 - if (!memory_region_section_intersect_range(&tmp, offset, size)) { - continue; - } - ret =3D rdl->notify_populate(rdl, &tmp); + /* + * Only notify about regions that are populated in ALL sources. + * replay_by_populated_state checks all sources including the one that + * just populated. + */ + QLIST_FOREACH(rdl, &rdm->rdl_list, next) { + ret =3D replay_by_populated_state(rdm, §ion, NULL, true, + rdl_populate_cb, rdl); if (ret) { break; } } =20 if (ret) { - /* Notify all already-notified listeners about discard. */ + /* + * Rollback: notify discard for listeners we already notified, + * including the failing listener which may have been partially + * notified. 
Listeners must handle discard notifications for + * regions they didn't receive populate notifications for. + */ QLIST_FOREACH(rdl2, &rdm->rdl_list, next) { - MemoryRegionSection tmp =3D *rdl2->section; - + replay_by_populated_state(rdm, §ion, NULL, true, + rdl_discard_cb, rdl2); if (rdl2 =3D=3D rdl) { break; } - if (!memory_region_section_intersect_range(&tmp, offset, size)= ) { - continue; - } - rdl2->notify_discard(rdl2, &tmp); } } return ret; } =20 void ram_discard_manager_notify_discard(RamDiscardManager *rdm, + RamDiscardSource *source, uint64_t offset, uint64_t size) { RamDiscardListener *rdl; - + MemoryRegionSection section =3D { + .mr =3D rdm->mr, + .offset_within_region =3D offset, + .size =3D int128_make64(size), + }; + + g_assert(ram_discard_manager_find_source(rdm, source)); + + /* + * Only notify about ranges that were aggregately populated before this + * source's discard. Since the source has already updated its state, + * we use replay_by_populated_state with this source skipped - it will + * replay only the ranges where all OTHER sources are populated. + */ QLIST_FOREACH(rdl, &rdm->rdl_list, next) { - MemoryRegionSection tmp =3D *rdl->section; - - if (!memory_region_section_intersect_range(&tmp, offset, size)) { - continue; - } - rdl->notify_discard(rdl, &tmp); + replay_by_populated_state(rdm, §ion, source, true, + rdl_discard_cb, rdl); } } =20 -void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm) +void ram_discard_manager_notify_discard_all(RamDiscardManager *rdm, + RamDiscardSource *source) { RamDiscardListener *rdl; =20 + g_assert(ram_discard_manager_find_source(rdm, source)); + QLIST_FOREACH(rdl, &rdm->rdl_list, next) { rdl->notify_discard(rdl, rdl->section); } } =20 -static int rdm_populate_cb(const MemoryRegionSection *section, void *opaqu= e) -{ - RamDiscardListener *rdl =3D opaque; - - return rdl->notify_populate(rdl, section); -} - void ram_discard_manager_register_listener(RamDiscardManager *rdm, RamDiscardListener *rdl, MemoryRegionSection *section) @@ -182,8 +532,8 @@ void ram_discard_manager_register_listener(RamDiscardMa= nager *rdm, rdl->section =3D memory_region_section_new_copy(section); QLIST_INSERT_HEAD(&rdm->rdl_list, rdl, next); =20 - ret =3D ram_discard_source_replay_populated(rdm->rds, rdl->section, - rdm_populate_cb, rdl); + ret =3D ram_discard_manager_replay_populated(rdm, rdl->section, + rdl_populate_cb, rdl); if (ret) { error_report("%s: Replaying populated ranges failed: %s", __func__, strerror(-ret)); @@ -208,8 +558,8 @@ int ram_discard_manager_replay_populated_to_listeners(R= amDiscardManager *rdm) int ret =3D 0; =20 QLIST_FOREACH(rdl, &rdm->rdl_list, next) { - ret =3D ram_discard_source_replay_populated(rdm->rds, rdl->section, - rdm_populate_cb, rdl); + ret =3D ram_discard_manager_replay_populated(rdm, rdl->section, + rdl_populate_cb, rdl); if (ret) { break; } --=20 2.52.0
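For readers who want to see the aggregation rule from the commit message in
isolation, here is a minimal, self-contained C model. It is not QEMU code and
does not use the RamDiscardManager API from this patch; all names in it are
invented for illustration. Each "source" keeps its own populated bitmap, and
the manager reports a page as populated only when every source reports it
populated, i.e. discarded as soon as any source discards it.

/* Standalone sketch of the "populated = ALL sources populated" aggregation.
 * Illustration only; names and structures are hypothetical, not QEMU's. */
#include <stdbool.h>
#include <stdio.h>

#define NUM_PAGES   8
#define MAX_SOURCES 4

/* One bitmap per source: true means the source considers the page populated. */
typedef struct {
    bool populated[NUM_PAGES];
} Source;

typedef struct {
    Source *sources[MAX_SOURCES];
    int nr_sources;
} Manager;

/* Aggregated query: a page is populated only if ALL sources agree (AND).
 * With no sources registered, everything is treated as populated. */
static bool manager_is_populated(const Manager *m, int page)
{
    for (int i = 0; i < m->nr_sources; i++) {
        if (!m->sources[i]->populated[page]) {
            return false;   /* ANY source discarding makes it discarded */
        }
    }
    return true;
}

int main(void)
{
    Source virtio_mem_like = { .populated = { 1, 1, 1, 1, 0, 0, 0, 0 } };
    Source attributes_like = { .populated = { 1, 1, 0, 0, 1, 1, 0, 0 } };
    Manager m = { .sources = { &virtio_mem_like, &attributes_like },
                  .nr_sources = 2 };

    /* Only pages 0 and 1 are populated in both sources, so only those are
     * reported as populated by the aggregated view. */
    for (int page = 0; page < NUM_PAGES; page++) {
        printf("page %d: %s\n", page,
               manager_is_populated(&m, page) ? "populated" : "discarded");
    }
    return 0;
}

In this model, registering a new source corresponds to adding another bitmap,
which can only shrink the aggregated populated set; removing one can only grow
it. That is why the patch notifies listeners of newly discarded regions when a
source is added and of newly populated regions when a source is removed.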