[PATCH 00/14] Fix memory region use-after-finalization

Akihiko Odaki posted 14 patches 4 months, 3 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20250917-subregion-v1-0-bef37d9b4f73@rsg.ci.i.u-tokyo.ac.jp
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Helge Deller <deller@gmx.de>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Eduardo Habkost <eduardo@habkost.net>, Gerd Hoffmann <kraxel@redhat.com>, John Snow <jsnow@redhat.com>, Keith Busch <kbusch@kernel.org>, Klaus Jensen <its@irrelevant.dk>, Jesper Devantier <foss@defmacro.it>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Nicholas Piggin <npiggin@gmail.com>, Daniel Henrique Barboza <danielhb413@gmail.com>, Harsh Prateek Bora <harshpb@linux.ibm.com>, John Levon <john.levon@nutanix.com>, Thanos Makatos <thanos.makatos@nutanix.com>, Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>
MAINTAINERS                |  1 +
include/hw/pci/pci.h       |  1 +
include/hw/qdev-core.h     |  1 +
hw/char/diva-gsp.c         |  1 -
hw/char/serial-pci-multi.c |  1 -
hw/core/qdev.c             | 14 +++++++
hw/display/vga-pci.c       |  8 ----
hw/ide/cmd646.c            | 12 ------
hw/ide/piix.c              | 13 -------
hw/ide/via.c               | 12 ------
hw/nvme/ctrl.c             |  2 -
hw/pci/pci.c               | 22 +----------
hw/pci/pci_bridge.c        | 96 +++++++++++++++++++++++++---------------------
hw/ppc/spapr_pci.c         | 22 -----------
hw/usb/hcd-ehci.c          |  4 --
hw/usb/hcd-xhci.c          | 10 -----
hw/vfio-user/pci.c         |  6 ---
stubs/memory.c             |  9 +++++
stubs/meson.build          |  1 +
19 files changed, 80 insertions(+), 156 deletions(-)
[PATCH 00/14] Fix memory region use-after-finalization
Posted by Akihiko Odaki 4 months, 3 weeks ago
Based-on: <20250917-use-v3-0-72c2a6887c6c@rsg.ci.i.u-tokyo.ac.jp>
("[PATCH v3 0/7] Do not unparent in instance_finalize()")

This patch series was spun off from "[PATCH v2 00/15] Fix memory region
leaks and use-after-finalization":
https://lore.kernel.org/qemu-devel/20250915-use-v2-0-f4c7ff13bfe9@rsg.ci.i.u-tokyo.ac.jp/

When developing the next version of "[PATCH 00/16] memory: Stop
piggybacking on memory region owners*", I faced multiple memory region
leaks and use-after-finalization. This series extracts their fixes so
that the number of Cc: won't explode.

Patch "qdev: Automatically delete memory subregions" and the succeeding
patches are for refactoring, but patch "vfio-user: Do not delete the
subregion" does fix use-after-finalization.

* https://lore.kernel.org/qemu-devel/20250901-mr-v1-0-dd7cb6b1480b@rsg.ci.i.u-tokyo.ac.jp/

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
Akihiko Odaki (14):
      hw/pci-bridge: Do not assume immediate MemoryRegion finalization
      qdev: Automatically delete memory subregions
      vfio-user: Do not delete the subregion
      hw/char/diva-gsp: Do not delete the subregion
      hw/char/serial-pci-multi: Do not delete the subregion
      secondary-vga: Do not delete the subregions
      cmd646: Do not delete the subregions
      hw/ide/piix: Do not delete the subregions
      hw/ide/via: Do not delete the subregions
      hw/nvme: Do not delete the subregion
      pci: Do not delete the subregions
      hw/ppc/spapr_pci: Do not delete the subregions
      hw/usb/hcd-ehci: Do not delete the subregions
      hw/usb/hcd-xhci: Do not delete the subregions

 MAINTAINERS                |  1 +
 include/hw/pci/pci.h       |  1 +
 include/hw/qdev-core.h     |  1 +
 hw/char/diva-gsp.c         |  1 -
 hw/char/serial-pci-multi.c |  1 -
 hw/core/qdev.c             | 14 +++++++
 hw/display/vga-pci.c       |  8 ----
 hw/ide/cmd646.c            | 12 ------
 hw/ide/piix.c              | 13 -------
 hw/ide/via.c               | 12 ------
 hw/nvme/ctrl.c             |  2 -
 hw/pci/pci.c               | 22 +----------
 hw/pci/pci_bridge.c        | 96 +++++++++++++++++++++++++---------------------
 hw/ppc/spapr_pci.c         | 22 -----------
 hw/usb/hcd-ehci.c          |  4 --
 hw/usb/hcd-xhci.c          | 10 -----
 hw/vfio-user/pci.c         |  6 ---
 stubs/memory.c             |  9 +++++
 stubs/meson.build          |  1 +
 19 files changed, 80 insertions(+), 156 deletions(-)
---
base-commit: e101d33792530093fa0b0a6e5f43e4d8cfe4581e
change-id: 20250917-subregion-907ced7da1ed
prerequisite-change-id: 20250906-use-37ecc903a9e0:v3
prerequisite-patch-id: d464fda86a3c79ff8e6d7a2e623d979b2a47019b
prerequisite-patch-id: 17b153237f69c898b9c5b93aad0d5116d0bfe49f
prerequisite-patch-id: ac51d9c4ac483054ee91cecbb5575def67dbb602
prerequisite-patch-id: 205aa86c0ef087c97dbcf736062661a45c287bf3
prerequisite-patch-id: 26e18a249afaf9cd1b72961f9e2e3ebf97966a3c
prerequisite-patch-id: d3e0b87f84a216e05bd4aa3dee8ae77cf9df062a
prerequisite-patch-id: 510a59304274e1bc35f8fbe77c91fc2f32a2f087

Best regards,
--  
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Re: [PATCH 00/14] Fix memory region use-after-finalization
Posted by Paolo Bonzini 4 months, 1 week ago
On 9/17/25 12:32, Akihiko Odaki wrote:
> Based-on: <20250917-use-v3-0-72c2a6887c6c@rsg.ci.i.u-tokyo.ac.jp>
> ("[PATCH v3 0/7] Do not unparent in instance_finalize()")
> 
> This patch series was spun off from "[PATCH v2 00/15] Fix memory region
> leaks and use-after-finalization":
> https://lore.kernel.org/qemu-devel/20250915-use-v2-0-f4c7ff13bfe9@rsg.ci.i.u-tokyo.ac.jp/
> 
> When developing the next version of "[PATCH 00/16] memory: Stop
> piggybacking on memory region owners*", I faced multiple memory region
> leaks and use-after-finalization. This series extracts their fixes so
> that the number of Cc: won't explode.
> 
> Patch "qdev: Automatically delete memory subregions" and the succeeding
> patches are for refactoring, but patch "vfio-user: Do not delete the
> subregion" does fix use-after-finalization.
> 
> * https://lore.kernel.org/qemu-devel/20250901-mr-v1-0-dd7cb6b1480b@rsg.ci.i.u-tokyo.ac.jp/
> 
> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>

This makes sense, but I think it is not bisectable, because of this in 
memory_region_del_subregion():

     assert(subregion->container == mr);
     subregion->container = NULL;

You would need to add a temporary

     if (subregion->container == NULL) {
         return;
     }

and undo it at the end of the series.  Do you agree?  With this change I 
can apply it.

Paolo
Re: [PATCH 00/14] Fix memory region use-after-finalization
Posted by Akihiko Odaki 4 months ago
On 2025/10/03 0:03, Paolo Bonzini wrote:
> On 9/17/25 12:32, Akihiko Odaki wrote:
>> Based-on: <20250917-use-v3-0-72c2a6887c6c@rsg.ci.i.u-tokyo.ac.jp>
>> ("[PATCH v3 0/7] Do not unparent in instance_finalize()")
>>
>> This patch series was spun off from "[PATCH v2 00/15] Fix memory region
>> leaks and use-after-finalization":
>> https://lore.kernel.org/qemu-devel/20250915-use-v2-0- 
>> f4c7ff13bfe9@rsg.ci.i.u-tokyo.ac.jp/
>>
>> When developing the next version of "[PATCH 00/16] memory: Stop
>> piggybacking on memory region owners*", I faced multiple memory region
>> leaks and use-after-finalization. This series extracts their fixes so
>> that the number of Cc: won't explode.
>>
>> Patch "qdev: Automatically delete memory subregions" and the succeeding
>> patches are for refactoring, but patch "vfio-user: Do not delete the
>> subregion" does fix use-after-finalization.
>>
>> * https://lore.kernel.org/qemu-devel/20250901-mr-v1-0- 
>> dd7cb6b1480b@rsg.ci.i.u-tokyo.ac.jp/
>>
>> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> 
> This makes sense, but I think it is not bisectable, because of this in 
> memory_region_del_subregion():
> 
>      assert(subregion->container == mr);
>      subregion->container = NULL;
> 
> You would need to add a temporary
> 
>      if (subregion->container == NULL) {
>          return;
>      }
> 
> and undo it at the end of the series.  Do you agree?  With this change I 
> can apply it.

It is unnecessary because patch "qdev: Automatically delete memory 
subregions" satisfies the following:

1. the device-specific code can assume that subregions they added are 
present until it finishes unrealization. The unrealize() callback can 
also assume the subregions are present and delete them. qdev satisfies 
this by deleting subregions only after calling the unrealize().

2. qdev should delete the remaining subregions before it finishes 
unrealization to ensure that the devices are hidden from the guest. qdev 
satisfies this by checking if memory regions have containers before 
deleting.

Regards,
Akihiko Odaki

Re: [PATCH 00/14] Fix memory region use-after-finalization
Posted by Peter Xu 4 months ago
On Fri, Oct 10, 2025 at 07:20:04PM +0900, Akihiko Odaki wrote:
> On 2025/10/03 0:03, Paolo Bonzini wrote:
> > On 9/17/25 12:32, Akihiko Odaki wrote:
> > > Based-on: <20250917-use-v3-0-72c2a6887c6c@rsg.ci.i.u-tokyo.ac.jp>
> > > ("[PATCH v3 0/7] Do not unparent in instance_finalize()")
> > > 
> > > This patch series was spun off from "[PATCH v2 00/15] Fix memory region
> > > leaks and use-after-finalization":
> > > https://lore.kernel.org/qemu-devel/20250915-use-v2-0-
> > > f4c7ff13bfe9@rsg.ci.i.u-tokyo.ac.jp/
> > > 
> > > When developing the next version of "[PATCH 00/16] memory: Stop
> > > piggybacking on memory region owners*", I faced multiple memory region
> > > leaks and use-after-finalization. This series extracts their fixes so
> > > that the number of Cc: won't explode.
> > > 
> > > Patch "qdev: Automatically delete memory subregions" and the succeeding
> > > patches are for refactoring, but patch "vfio-user: Do not delete the
> > > subregion" does fix use-after-finalization.
> > > 
> > > * https://lore.kernel.org/qemu-devel/20250901-mr-v1-0-
> > > dd7cb6b1480b@rsg.ci.i.u-tokyo.ac.jp/
> > > 
> > > Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> > 
> > This makes sense, but I think it is not bisectable, because of this in
> > memory_region_del_subregion():
> > 
> >      assert(subregion->container == mr);
> >      subregion->container = NULL;
> > 
> > You would need to add a temporary
> > 
> >      if (subregion->container == NULL) {
> >          return;
> >      }
> > 
> > and undo it at the end of the series.  Do you agree?  With this change I
> > can apply it.
> 
> It is unnecessary because patch "qdev: Automatically delete memory
> subregions" satisfies the following:
> 
> 1. the device-specific code can assume that subregions they added are
> present until it finishes unrealization. The unrealize() callback can also
> assume the subregions are present and delete them. qdev satisfies this by
> deleting subregions only after calling the unrealize().
> 
> 2. qdev should delete the remaining subregions before it finishes
> unrealization to ensure that the devices are hidden from the guest. qdev
> satisfies this by checking if memory regions have containers before
> deleting.

There's another option, which is to have the auto-detach patch for qdev
merged, meanwhile we can still keep the "good citizens" who do explicit
detachments in unrealize(), so that it won't confuse the reader on a
specific device about when did the MR went away.  I believe there will be
developers get confused by them otherwise.

The auto detach will only be a final safety belt to make sure MRs won't get
leaked.

If that is an option, it may also explain my original question on what
problem we're fixing... the extreme case is what this series removed
afterwards are exactly all such uses that may need an explicit detachment
of MRs (ignoring there can be owner-shared MR links, which is even out of
the picture).  Then the auto-detach feature may fix nothing, so we merge a
patch, looks good, logically sensible, but not functioning.

The reality might be, we may still have some qdev that should unlink the
MRs in unrealize(), but they didn't.  However again I suspect it's rare,
especially considering the recently merged auto-detach in finalize().  I'd
just go and fix them, but only if they're a real problem (aka,
unpluggable). Then, we know why we introduce a change, and especially, when
the change can go away.

OTOH, if we merge auto-detach of qdev MRs, we likely need to stick it
forever because justifying it not needed will be extremely hard if not
impossible.. after all we do not know what it fixes.  So if we can at least
list the devices that may be fixed with it in the commit message, it would
be a benefit.

Once again, feel free to treat my comment as a pure question.

Thanks,

-- 
Peter Xu