[PATCH 10/22] qdev: Automatically delete memory subregions

Akihiko Odaki posted 22 patches 5 months, 1 week ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Helge Deller <deller@gmx.de>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Eduardo Habkost <eduardo@habkost.net>, Gerd Hoffmann <kraxel@redhat.com>, John Snow <jsnow@redhat.com>, Keith Busch <kbusch@kernel.org>, Klaus Jensen <its@irrelevant.dk>, Jesper Devantier <foss@defmacro.it>, "Hervé Poussineau" <hpoussin@reactos.org>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Nicholas Piggin <npiggin@gmail.com>, Harsh Prateek Bora <harshpb@linux.ibm.com>, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>, Artyom Tarasenko <atar4qemu@gmail.com>, John Levon <john.levon@nutanix.com>, Thanos Makatos <thanos.makatos@nutanix.com>, "Cédric Le Goater" <clg@redhat.com>, Alex Williamson <alex.williamson@redhat.com>, Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Aurelien Jarno <aurelien@aurel32.net>, Jiaxun Yang <jiaxun.yang@flygoat.com>, Aleksandar Rikalo <arikalo@gmail.com>, Max Filippov <jcmvbkbc@gmail.com>
[PATCH 10/22] qdev: Automatically delete memory subregions
Posted by Akihiko Odaki 5 months, 1 week ago
A common pattern is that to delete memory subregions during realization
error handling and unrealization. pci automatically automatically
deletes the IO subregions, but the pattern is manually implemented
in other places, which is tedious and error-prone.

Implement the logic to delete subregions in qdev to cover all devices.

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
 MAINTAINERS            |  1 +
 include/hw/qdev-core.h |  1 +
 hw/core/qdev.c         | 14 ++++++++++++++
 stubs/memory.c         |  9 +++++++++
 stubs/meson.build      |  1 +
 5 files changed, 26 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8147fff3523eaa45c4a0d2c21d40b4ade3f419ff..4665f0a4b7a513c5863f6d5227a0173c836505e6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3212,6 +3212,7 @@ F: include/system/memory.h
 F: include/system/ram_addr.h
 F: include/system/ramblock.h
 F: include/system/memory_mapping.h
+F: stubs/memory.c
 F: system/dma-helpers.c
 F: system/ioport.c
 F: system/memory.c
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 530f3da70218df59da72dc7a975dca8265600e00..8f443d5f8ea5f31d69181cc1ec53a0b022eb71cc 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -526,6 +526,7 @@ bool qdev_realize_and_unref(DeviceState *dev, BusState *bus, Error **errp);
  *  - unrealize any child buses by calling qbus_unrealize()
  *    (this will recursively unrealize any devices on those buses)
  *  - call the unrealize method of @dev
+ *  - remove @dev from memory
  *
  * The device can then be freed by causing its reference count to go
  * to zero.
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index f600226176871361d7ff3875f5d06bd4e614be6e..8fdf6774f87ec8424348e8c9652dc4c99a2faeb5 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -451,6 +451,17 @@ static bool check_only_migratable(Object *obj, Error **errp)
     return true;
 }
 
+static int del_memory_region(Object *child, void *opaque)
+{
+    MemoryRegion *mr = (MemoryRegion *)object_dynamic_cast(child, TYPE_MEMORY_REGION);
+
+    if (mr && mr->container) {
+        memory_region_del_subregion(mr->container, mr);
+    }
+
+    return 0;
+}
+
 static void device_set_realized(Object *obj, bool value, Error **errp)
 {
     DeviceState *dev = DEVICE(obj);
@@ -582,6 +593,7 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
         if (dc->unrealize) {
             dc->unrealize(dev);
         }
+        object_child_foreach(OBJECT(dev), del_memory_region, NULL);
         dev->pending_deleted_event = true;
         DEVICE_LISTENER_CALL(unrealize, Reverse, dev);
     }
@@ -606,6 +618,8 @@ post_realize_fail:
     }
 
 fail:
+    object_child_foreach(OBJECT(dev), del_memory_region, NULL);
+
     error_propagate(errp, local_err);
     if (unattached_parent) {
         /*
diff --git a/stubs/memory.c b/stubs/memory.c
new file mode 100644
index 0000000000000000000000000000000000000000..9c36531ae542d804dc19ed2a3c657005881a2bca
--- /dev/null
+++ b/stubs/memory.c
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include "qemu/osdep.h"
+#include "system/memory.h"
+
+void memory_region_del_subregion(MemoryRegion *mr,
+                                 MemoryRegion *subregion)
+{
+}
diff --git a/stubs/meson.build b/stubs/meson.build
index cef046e6854ddaa9f12714c317a541ea75b8d412..b4df4e60a1af89c9354d5b92449ce5409095b9f1 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -95,5 +95,6 @@ if have_system or have_user
 
   # Also included in have_system for tests/unit/test-qdev-global-props
   stub_ss.add(files('hotplug-stubs.c'))
+  stub_ss.add(files('memory.c'))
   stub_ss.add(files('sysbus.c'))
 endif

-- 
2.51.0
Re: [PATCH 10/22] qdev: Automatically delete memory subregions
Posted by Peter Xu 5 months ago
On Sat, Sep 06, 2025 at 04:11:19AM +0200, Akihiko Odaki wrote:
> A common pattern is that to delete memory subregions during realization
> error handling and unrealization. pci automatically automatically
> deletes the IO subregions, but the pattern is manually implemented
> in other places, which is tedious and error-prone.

I don't think they're the same?  What is the ultimate goal of this change?

PCI core only detachs all the BARs from the address space registered from
pci_register_bar() explicitly.  It's not an object_dynamic_cast() detaching
every MR not matter what it is..

The other thing it does is detaching the DMA root memory region.

I'm not fluent with pci, but IMHO it's good to keep those explicit.

> 
> Implement the logic to delete subregions in qdev to cover all devices.
> 
> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> ---
>  MAINTAINERS            |  1 +
>  include/hw/qdev-core.h |  1 +
>  hw/core/qdev.c         | 14 ++++++++++++++
>  stubs/memory.c         |  9 +++++++++
>  stubs/meson.build      |  1 +
>  5 files changed, 26 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8147fff3523eaa45c4a0d2c21d40b4ade3f419ff..4665f0a4b7a513c5863f6d5227a0173c836505e6 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3212,6 +3212,7 @@ F: include/system/memory.h
>  F: include/system/ram_addr.h
>  F: include/system/ramblock.h
>  F: include/system/memory_mapping.h
> +F: stubs/memory.c
>  F: system/dma-helpers.c
>  F: system/ioport.c
>  F: system/memory.c
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 530f3da70218df59da72dc7a975dca8265600e00..8f443d5f8ea5f31d69181cc1ec53a0b022eb71cc 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -526,6 +526,7 @@ bool qdev_realize_and_unref(DeviceState *dev, BusState *bus, Error **errp);
>   *  - unrealize any child buses by calling qbus_unrealize()
>   *    (this will recursively unrealize any devices on those buses)
>   *  - call the unrealize method of @dev
> + *  - remove @dev from memory
>   *
>   * The device can then be freed by causing its reference count to go
>   * to zero.
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index f600226176871361d7ff3875f5d06bd4e614be6e..8fdf6774f87ec8424348e8c9652dc4c99a2faeb5 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -451,6 +451,17 @@ static bool check_only_migratable(Object *obj, Error **errp)
>      return true;
>  }
>  
> +static int del_memory_region(Object *child, void *opaque)
> +{
> +    MemoryRegion *mr = (MemoryRegion *)object_dynamic_cast(child, TYPE_MEMORY_REGION);
> +
> +    if (mr && mr->container) {
> +        memory_region_del_subregion(mr->container, mr);
> +    }
> +
> +    return 0;
> +}
> +
>  static void device_set_realized(Object *obj, bool value, Error **errp)
>  {
>      DeviceState *dev = DEVICE(obj);
> @@ -582,6 +593,7 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
>          if (dc->unrealize) {
>              dc->unrealize(dev);
>          }
> +        object_child_foreach(OBJECT(dev), del_memory_region, NULL);
>          dev->pending_deleted_event = true;
>          DEVICE_LISTENER_CALL(unrealize, Reverse, dev);
>      }
> @@ -606,6 +618,8 @@ post_realize_fail:
>      }
>  
>  fail:
> +    object_child_foreach(OBJECT(dev), del_memory_region, NULL);
> +
>      error_propagate(errp, local_err);
>      if (unattached_parent) {
>          /*
> diff --git a/stubs/memory.c b/stubs/memory.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9c36531ae542d804dc19ed2a3c657005881a2bca
> --- /dev/null
> +++ b/stubs/memory.c
> @@ -0,0 +1,9 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +
> +#include "qemu/osdep.h"
> +#include "system/memory.h"
> +
> +void memory_region_del_subregion(MemoryRegion *mr,
> +                                 MemoryRegion *subregion)
> +{
> +}
> diff --git a/stubs/meson.build b/stubs/meson.build
> index cef046e6854ddaa9f12714c317a541ea75b8d412..b4df4e60a1af89c9354d5b92449ce5409095b9f1 100644
> --- a/stubs/meson.build
> +++ b/stubs/meson.build
> @@ -95,5 +95,6 @@ if have_system or have_user
>  
>    # Also included in have_system for tests/unit/test-qdev-global-props
>    stub_ss.add(files('hotplug-stubs.c'))
> +  stub_ss.add(files('memory.c'))
>    stub_ss.add(files('sysbus.c'))
>  endif
> 
> -- 
> 2.51.0
> 

-- 
Peter Xu
Re: [PATCH 10/22] qdev: Automatically delete memory subregions
Posted by Akihiko Odaki 5 months ago
On 2025/09/11 6:10, Peter Xu wrote:
> On Sat, Sep 06, 2025 at 04:11:19AM +0200, Akihiko Odaki wrote:
>> A common pattern is that to delete memory subregions during realization
>> error handling and unrealization. pci automatically automatically
>> deletes the IO subregions, but the pattern is manually implemented
>> in other places, which is tedious and error-prone.
> 
> I don't think they're the same?  What is the ultimate goal of this change?

Covering all devices in the common code is less tedious and error-prone 
because the same logic will not be duplicated by the subclasses.

> 
> PCI core only detachs all the BARs from the address space registered from
> pci_register_bar() explicitly.  It's not an object_dynamic_cast() detaching
> every MR not matter what it is..

All devices share one semantics: they are exposed to the guest only when 
they are realized. Therefore, every MRs should be detached after 
unrealization, no matter what it is.

> 
> The other thing it does is detaching the DMA root memory region.
> 
> I'm not fluent with pci, but IMHO it's good to keep those explicit.

Explicit detaching is "tedious and error-prone"; more concretely, there 
are the following two error situations:
1. Forget the detachment and cause a memory leak.
2. Perform the detachment after finalization, which could have happened 
with a manual detachment in "[PATCH 11/22] vfio-user: Do not delete the 
subregion".

Regards,
Akihiko Odaki