[PATCH 07/22] hw/pci-host/raven: Fix AddressSpace exposure timing

Akihiko Odaki posted 22 patches 5 months, 1 week ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Helge Deller <deller@gmx.de>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Eduardo Habkost <eduardo@habkost.net>, Gerd Hoffmann <kraxel@redhat.com>, John Snow <jsnow@redhat.com>, Keith Busch <kbusch@kernel.org>, Klaus Jensen <its@irrelevant.dk>, Jesper Devantier <foss@defmacro.it>, "Hervé Poussineau" <hpoussin@reactos.org>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Nicholas Piggin <npiggin@gmail.com>, Harsh Prateek Bora <harshpb@linux.ibm.com>, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>, Artyom Tarasenko <atar4qemu@gmail.com>, John Levon <john.levon@nutanix.com>, Thanos Makatos <thanos.makatos@nutanix.com>, "Cédric Le Goater" <clg@redhat.com>, Alex Williamson <alex.williamson@redhat.com>, Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Aurelien Jarno <aurelien@aurel32.net>, Jiaxun Yang <jiaxun.yang@flygoat.com>, Aleksandar Rikalo <arikalo@gmail.com>, Max Filippov <jcmvbkbc@gmail.com>
[PATCH 07/22] hw/pci-host/raven: Fix AddressSpace exposure timing
Posted by Akihiko Odaki 5 months, 1 week ago
raven-pcihost is not hotpluggable but its instance can still be created
and finalized when processing the device-list-properties QMP command.
Exposing such a temporary instance to AddressSpace should be
avoided because it leaks the instance.

Expose instances to the AddressSpace at their realization time so that
it won't happen for the temporary instances.

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
 hw/pci-host/raven.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
index f8c0be5d21c351305742a696a65f70f87b546b0c..82f245c91cf267cdc6a518765a8c31d06eac7228 100644
--- a/hw/pci-host/raven.c
+++ b/hw/pci-host/raven.c
@@ -233,6 +233,20 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
     MemoryRegion *address_space_mem = get_system_memory();
     int i;
 
+    address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
+
+    /* CPU address space */
+    memory_region_add_subregion(address_space_mem, PCI_IO_BASE_ADDR,
+                                &s->pci_io);
+    memory_region_add_subregion_overlap(address_space_mem, PCI_IO_BASE_ADDR,
+                                        &s->pci_io_non_contiguous, 1);
+    memory_region_add_subregion(address_space_mem, 0xc0000000, &s->pci_memory);
+    pci_root_bus_init(&s->pci_bus, sizeof(s->pci_bus), DEVICE(d), NULL,
+                      &s->pci_memory, &s->pci_io, 0, TYPE_PCI_BUS);
+
+    address_space_init(&s->bm_as, &s->bm, "raven-bm");
+    pci_setup_iommu(&s->pci_bus, &raven_iommu_ops, s);
+
     /*
      * According to PReP specification section 6.1.6 "System Interrupt
      * Assignments", all PCI interrupts are routed via IRQ 15
@@ -276,14 +290,12 @@ static void raven_pcihost_initfn(Object *obj)
 {
     PCIHostState *h = PCI_HOST_BRIDGE(obj);
     PREPPCIState *s = RAVEN_PCI_HOST_BRIDGE(obj);
-    MemoryRegion *address_space_mem = get_system_memory();
     DeviceState *pci_dev;
 
     memory_region_init(&s->pci_io, obj, "pci-io", 0x3f800000);
     memory_region_init_io(&s->pci_io_non_contiguous, obj, &raven_io_ops, s,
                           "pci-io-non-contiguous", 0x00800000);
     memory_region_init(&s->pci_memory, obj, "pci-memory", 0x3f000000);
-    address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
 
     /*
      * Raven's raven_io_ops use the address-space API to access pci-conf-idx
@@ -292,15 +304,6 @@ static void raven_pcihost_initfn(Object *obj)
      */
     s->pci_io_non_contiguous.disable_reentrancy_guard = true;
 
-    /* CPU address space */
-    memory_region_add_subregion(address_space_mem, PCI_IO_BASE_ADDR,
-                                &s->pci_io);
-    memory_region_add_subregion_overlap(address_space_mem, PCI_IO_BASE_ADDR,
-                                        &s->pci_io_non_contiguous, 1);
-    memory_region_add_subregion(address_space_mem, 0xc0000000, &s->pci_memory);
-    pci_root_bus_init(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), NULL,
-                      &s->pci_memory, &s->pci_io, 0, TYPE_PCI_BUS);
-
     /* Bus master address space */
     memory_region_init(&s->bm, obj, "bm-raven", 4 * GiB);
     memory_region_init_alias(&s->bm_pci_memory_alias, obj, "bm-pci-memory",
@@ -310,8 +313,6 @@ static void raven_pcihost_initfn(Object *obj)
                              get_system_memory(), 0, 0x80000000);
     memory_region_add_subregion(&s->bm, 0         , &s->bm_pci_memory_alias);
     memory_region_add_subregion(&s->bm, 0x80000000, &s->bm_ram_alias);
-    address_space_init(&s->bm_as, &s->bm, "raven-bm");
-    pci_setup_iommu(&s->pci_bus, &raven_iommu_ops, s);
 
     h->bus = &s->pci_bus;
 

-- 
2.51.0
Re: [PATCH 07/22] hw/pci-host/raven: Fix AddressSpace exposure timing
Posted by BALATON Zoltan 5 months, 1 week ago
On Sat, 6 Sep 2025, Akihiko Odaki wrote:
> raven-pcihost is not hotpluggable but its instance can still be created
> and finalized when processing the device-list-properties QMP command.
> Exposing such a temporary instance to AddressSpace should be
> avoided because it leaks the instance.

This may suggest there's an issue with freeing address spaces that should 
be fixed there at one place if possible rather than adding a new rule to 
remember not to create address spaces in init methods. Should adress space 
be a child of the device too so it's freed with it? This series simplifies 
some devices removing the rule to remember to unparent memory regions but 
this now adds another rule to remember avoid creating address spaces in 
init so overall we're in the same situation and still have to work around 
issues. This trades one workaround for another so still cannot fix all 
issues with freeing all created objects.

> Expose instances to the AddressSpace at their realization time so that
> it won't happen for the temporary instances.

I had a series here: 
https://patchew.org/QEMU/cover.1751493467.git.balaton@eik.bme.hu/
that changes this for raven (among other clean ups) but I could not get 
that merged in the last devel cycle because of PPC being a bit 
unmaintained. I'd prefer that series to be taken first instead of this 
patch so I don't have to rebase that.

Regards,
BALATON Zoltan

> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> ---
> hw/pci-host/raven.c | 27 ++++++++++++++-------------
> 1 file changed, 14 insertions(+), 13 deletions(-)
>
> diff --git a/hw/pci-host/raven.c b/hw/pci-host/raven.c
> index f8c0be5d21c351305742a696a65f70f87b546b0c..82f245c91cf267cdc6a518765a8c31d06eac7228 100644
> --- a/hw/pci-host/raven.c
> +++ b/hw/pci-host/raven.c
> @@ -233,6 +233,20 @@ static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
>     MemoryRegion *address_space_mem = get_system_memory();
>     int i;
>
> +    address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
> +
> +    /* CPU address space */
> +    memory_region_add_subregion(address_space_mem, PCI_IO_BASE_ADDR,
> +                                &s->pci_io);
> +    memory_region_add_subregion_overlap(address_space_mem, PCI_IO_BASE_ADDR,
> +                                        &s->pci_io_non_contiguous, 1);
> +    memory_region_add_subregion(address_space_mem, 0xc0000000, &s->pci_memory);
> +    pci_root_bus_init(&s->pci_bus, sizeof(s->pci_bus), DEVICE(d), NULL,
> +                      &s->pci_memory, &s->pci_io, 0, TYPE_PCI_BUS);
> +
> +    address_space_init(&s->bm_as, &s->bm, "raven-bm");
> +    pci_setup_iommu(&s->pci_bus, &raven_iommu_ops, s);
> +
>     /*
>      * According to PReP specification section 6.1.6 "System Interrupt
>      * Assignments", all PCI interrupts are routed via IRQ 15
> @@ -276,14 +290,12 @@ static void raven_pcihost_initfn(Object *obj)
> {
>     PCIHostState *h = PCI_HOST_BRIDGE(obj);
>     PREPPCIState *s = RAVEN_PCI_HOST_BRIDGE(obj);
> -    MemoryRegion *address_space_mem = get_system_memory();
>     DeviceState *pci_dev;
>
>     memory_region_init(&s->pci_io, obj, "pci-io", 0x3f800000);
>     memory_region_init_io(&s->pci_io_non_contiguous, obj, &raven_io_ops, s,
>                           "pci-io-non-contiguous", 0x00800000);
>     memory_region_init(&s->pci_memory, obj, "pci-memory", 0x3f000000);
> -    address_space_init(&s->pci_io_as, &s->pci_io, "raven-io");
>
>     /*
>      * Raven's raven_io_ops use the address-space API to access pci-conf-idx
> @@ -292,15 +304,6 @@ static void raven_pcihost_initfn(Object *obj)
>      */
>     s->pci_io_non_contiguous.disable_reentrancy_guard = true;
>
> -    /* CPU address space */
> -    memory_region_add_subregion(address_space_mem, PCI_IO_BASE_ADDR,
> -                                &s->pci_io);
> -    memory_region_add_subregion_overlap(address_space_mem, PCI_IO_BASE_ADDR,
> -                                        &s->pci_io_non_contiguous, 1);
> -    memory_region_add_subregion(address_space_mem, 0xc0000000, &s->pci_memory);
> -    pci_root_bus_init(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), NULL,
> -                      &s->pci_memory, &s->pci_io, 0, TYPE_PCI_BUS);
> -
>     /* Bus master address space */
>     memory_region_init(&s->bm, obj, "bm-raven", 4 * GiB);
>     memory_region_init_alias(&s->bm_pci_memory_alias, obj, "bm-pci-memory",
> @@ -310,8 +313,6 @@ static void raven_pcihost_initfn(Object *obj)
>                              get_system_memory(), 0, 0x80000000);
>     memory_region_add_subregion(&s->bm, 0         , &s->bm_pci_memory_alias);
>     memory_region_add_subregion(&s->bm, 0x80000000, &s->bm_ram_alias);
> -    address_space_init(&s->bm_as, &s->bm, "raven-bm");
> -    pci_setup_iommu(&s->pci_bus, &raven_iommu_ops, s);
>
>     h->bus = &s->pci_bus;
>
>
>
Re: [PATCH 07/22] hw/pci-host/raven: Fix AddressSpace exposure timing
Posted by Akihiko Odaki 5 months ago
On 2025/09/06 18:03, BALATON Zoltan wrote:
> On Sat, 6 Sep 2025, Akihiko Odaki wrote:
>> raven-pcihost is not hotpluggable but its instance can still be created
>> and finalized when processing the device-list-properties QMP command.
>> Exposing such a temporary instance to AddressSpace should be
>> avoided because it leaks the instance.
> 
> This may suggest there's an issue with freeing address spaces that 
> should be fixed there at one place if possible rather than adding a new 
> rule to remember not to create address spaces in init methods. Should 
> adress space be a child of the device too so it's freed with it? This 
> series simplifies some devices removing the rule to remember to unparent 
> memory regions but this now adds another rule to remember avoid creating 
> address spaces in init so overall we're in the same situation and still 
> have to work around issues. This trades one workaround for another so 
> still cannot fix all issues with freeing all created objects.

Making address spaces children will fix memory leaks, but it still 
leaves external side effects in init methods. Avoiding external side 
effect in init methods is necessary because devices can be created and 
removed at any time for introspection and it is not a new rule.

It is not necessary to remember to avoid creating address spaces in 
init; roughly speaking, int methods should only add properties and 
anything else should be done during realization.

"[PATCH v2 0/3] memory: Stop piggybacking on memory region owners" will 
cause an assertion failure if someone still manages to create an address 
space accidentally in init, so it is practically not possible to make 
such a mistake.

This series do not remove the rule to remember to unparent memory 
regions; it instead removes the rule to remember to delete memory 
regions from containers, which are distinct operations. And you still 
cannot add memory regions to containers during initialization; you 
should instead add them during realization just as like address space 
creation.

> 
>> Expose instances to the AddressSpace at their realization time so that
>> it won't happen for the temporary instances.
> 
> I had a series here: https://patchew.org/QEMU/ 
> cover.1751493467.git.balaton@eik.bme.hu/
> that changes this for raven (among other clean ups) but I could not get 
> that merged in the last devel cycle because of PPC being a bit 
> unmaintained. I'd prefer that series to be taken first instead of this 
> patch so I don't have to rebase that.

Looking at the series, "[PATCH v2 13/14] hw/pci-host/raven: Do not map 
regions in init method" moves memory_region_init() and 
memory_region_init_io() from raven_pcihost_initfn() to 
raven_pcihost_realize(). This should be avoided because these function 
calls add memory regions as child properties, which may be introspected 
without realization. Perhaps you may drop the patch and rebase your 
series on top of this patch, or let me rebase this patch on that series 
without it.

I wrote a documentation update to reflect my understanding of 
initialization and realization so please check it out too.
https://patchew.org/QEMU/20250908-qdev-v1-1-df236f7ce5bd@rsg.ci.i.u-tokyo.ac.jp/

Regards,
Akihiko Odaki
Re: [PATCH 07/22] hw/pci-host/raven: Fix AddressSpace exposure timing
Posted by Peter Maydell 5 months ago
On Mon, 8 Sept 2025 at 16:21, Akihiko Odaki
<odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
>
> On 2025/09/06 18:03, BALATON Zoltan wrote:
> > I had a series here: https://patchew.org/QEMU/
> > cover.1751493467.git.balaton@eik.bme.hu/
> > that changes this for raven (among other clean ups) but I could not get
> > that merged in the last devel cycle because of PPC being a bit
> > unmaintained. I'd prefer that series to be taken first instead of this
> > patch so I don't have to rebase that.
>
> Looking at the series, "[PATCH v2 13/14] hw/pci-host/raven: Do not map
> regions in init method" moves memory_region_init() and
> memory_region_init_io() from raven_pcihost_initfn() to
> raven_pcihost_realize(). This should be avoided because these function
> calls add memory regions as child properties, which may be introspected
> without realization. Perhaps you may drop the patch and rebase your
> series on top of this patch, or let me rebase this patch on that series
> without it.

I think that patch is fine. It's OK to init MRs either in
instance_init or in realize, whichever is convenient for the
device. Initing an MR adds a child QOM property but they are
not intended for generic introspection. A MemoryRegion is not
part of the public interface to a device unless you call
sysbus_init_mmio() to declare it so.

This is part of a general issue we have with QOM, where we
use QOM properties both for "this is part of the public-facing
interface to this device" and also for "this is internal and
really we're only using a QOM child property for convenience
of having things automatically cleaned up".

thanks
-- PMM