[v3] Maintainer updates for May (testing, plugins, virtio-gpu) - pre-PR

[PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Alex Bennée 8 months, 3 weeks ago

From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>

This commit fixes an indefinite hang when using VIRTIO GPU blob objects
under TCG in certain conditions.

The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
MemoryRegion and attaches it to an offset on a PCI BAR of the
VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
it.

Because virglrenderer commands are not thread-safe they are only
called on the main context and QEMU performs the cleanup in three steps
to prevent a use-after-free scenario where the guest can access the
region after it’s unmapped:

1. From the main context, the region’s field finish_unmapping is false
   by default, so it sets a variable cmd_suspended, increases the
   renderer_blocked variable, deletes the blob subregion, and unparents
   the blob subregion causing its reference count to decrement.

2. From an RCU context, the MemoryView gets freed, the FlatView gets
   recalculated, the free callback of the blob region
   virtio_gpu_virgl_hostmem_region_free is called which sets the
   region’s field finish_unmapping to true, allowing the main thread
   context to finish replying to the command

3. From the main context, the command is processed again, but this time
   finish_unmapping is true, so virgl_renderer_resource_unmap can be
   called and a response is sent to the guest.

It happens so that under TCG, if the guest has no timers configured (and
thus no interrupt will cause the CPU to exit), the RCU thread does not
have enough time to grab the locks and recalculate the FlatView.

That’s not a big problem in practice since most guests will assume a
response will happen later in time and go on to do different things,
potentially triggering interrupts and allowing the RCU context to run.
If the guest waits for the unmap command to complete though, it blocks
indefinitely. Attaching to the QEMU monitor and force quitting the guest
allows the cleanup to continue.

There's no reason why the FlatView recalculation can't occur right away
when we delete the blob subregion, however. It does not, because when we
create the subregion we set the object as its own parent:

    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);

The extra reference is what prevents freeing the memory region object in
the memory transaction of deleting the subregion.

This commit changes the owner object to the device, which removes the
extra owner reference in the memory region and causes the MR to be
freed right away in the main context.

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
Cc: qemu-stable@nongnu.org
---
 hw/display/virtio-gpu-virgl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 71a7500de9..8fbe4e70cc 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
     vmr->g = g;
     mr = g_new0(MemoryRegion, 1);
 
-    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
+    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
     memory_region_add_subregion(&b->hostmem, offset, mr);
     memory_region_set_enabled(mr, true);
 
-- 
2.39.5

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Akihiko Odaki 8 months, 3 weeks ago

On 2025/05/22 1:42, Alex Bennée wrote:
> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> 
> This commit fixes an indefinite hang when using VIRTIO GPU blob objects
> under TCG in certain conditions.
> 
> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
> MemoryRegion and attaches it to an offset on a PCI BAR of the
> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
> it.
> 
> Because virglrenderer commands are not thread-safe they are only
> called on the main context and QEMU performs the cleanup in three steps
> to prevent a use-after-free scenario where the guest can access the
> region after it’s unmapped:
> 
> 1. From the main context, the region’s field finish_unmapping is false
>     by default, so it sets a variable cmd_suspended, increases the
>     renderer_blocked variable, deletes the blob subregion, and unparents
>     the blob subregion causing its reference count to decrement.
> 
> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>     recalculated, the free callback of the blob region
>     virtio_gpu_virgl_hostmem_region_free is called which sets the
>     region’s field finish_unmapping to true, allowing the main thread
>     context to finish replying to the command
> 
> 3. From the main context, the command is processed again, but this time
>     finish_unmapping is true, so virgl_renderer_resource_unmap can be
>     called and a response is sent to the guest.
> 
> It happens so that under TCG, if the guest has no timers configured (and
> thus no interrupt will cause the CPU to exit), the RCU thread does not
> have enough time to grab the locks and recalculate the FlatView.
> 
> That’s not a big problem in practice since most guests will assume a
> response will happen later in time and go on to do different things,
> potentially triggering interrupts and allowing the RCU context to run.
> If the guest waits for the unmap command to complete though, it blocks
> indefinitely. Attaching to the QEMU monitor and force quitting the guest
> allows the cleanup to continue.
> 
> There's no reason why the FlatView recalculation can't occur right away
> when we delete the blob subregion, however. It does not, because when we
> create the subregion we set the object as its own parent:
> 
>      memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
> 
> The extra reference is what prevents freeing the memory region object in
> the memory transaction of deleting the subregion.
> 
> This commit changes the owner object to the device, which removes the
> extra owner reference in the memory region and causes the MR to be
> freed right away in the main context.
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> Tested-by: Alex Bennée <alex.bennee@linaro.org>
> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
> Cc: qemu-stable@nongnu.org
> ---
>   hw/display/virtio-gpu-virgl.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index 71a7500de9..8fbe4e70cc 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>       vmr->g = g;
>       mr = g_new0(MemoryRegion, 1);
>   
> -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
>       memory_region_add_subregion(&b->hostmem, offset, mr);
>       memory_region_set_enabled(mr, true);
>   

I suggest dropping this patch for now due to the reason I pointed out 
for the first version of this series.

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Alex Bennée 8 months, 3 weeks ago

Akihiko Odaki <akihiko.odaki@daynix.com> writes:

> On 2025/05/22 1:42, Alex Bennée wrote:
>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>> This commit fixes an indefinite hang when using VIRTIO GPU blob
>> objects
>> under TCG in certain conditions.
>> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
>> MemoryRegion and attaches it to an offset on a PCI BAR of the
>> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
>> it.
>> Because virglrenderer commands are not thread-safe they are only
>> called on the main context and QEMU performs the cleanup in three steps
>> to prevent a use-after-free scenario where the guest can access the
>> region after it’s unmapped:
>> 1. From the main context, the region’s field finish_unmapping is
>> false
>>     by default, so it sets a variable cmd_suspended, increases the
>>     renderer_blocked variable, deletes the blob subregion, and unparents
>>     the blob subregion causing its reference count to decrement.
>> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>>     recalculated, the free callback of the blob region
>>     virtio_gpu_virgl_hostmem_region_free is called which sets the
>>     region’s field finish_unmapping to true, allowing the main thread
>>     context to finish replying to the command
>> 3. From the main context, the command is processed again, but this
>> time
>>     finish_unmapping is true, so virgl_renderer_resource_unmap can be
>>     called and a response is sent to the guest.
>> It happens so that under TCG, if the guest has no timers configured
>> (and
>> thus no interrupt will cause the CPU to exit), the RCU thread does not
>> have enough time to grab the locks and recalculate the FlatView.
>> That’s not a big problem in practice since most guests will assume a
>> response will happen later in time and go on to do different things,
>> potentially triggering interrupts and allowing the RCU context to run.
>> If the guest waits for the unmap command to complete though, it blocks
>> indefinitely. Attaching to the QEMU monitor and force quitting the guest
>> allows the cleanup to continue.
>> There's no reason why the FlatView recalculation can't occur right
>> away
>> when we delete the blob subregion, however. It does not, because when we
>> create the subregion we set the object as its own parent:
>>      memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>> The extra reference is what prevents freeing the memory region
>> object in
>> the memory transaction of deleting the subregion.
>> This commit changes the owner object to the device, which removes
>> the
>> extra owner reference in the memory region and causes the MR to be
>> freed right away in the main context.
>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
>> Cc: qemu-stable@nongnu.org
>> ---
>>   hw/display/virtio-gpu-virgl.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> diff --git a/hw/display/virtio-gpu-virgl.c
>> b/hw/display/virtio-gpu-virgl.c
>> index 71a7500de9..8fbe4e70cc 100644
>> --- a/hw/display/virtio-gpu-virgl.c
>> +++ b/hw/display/virtio-gpu-virgl.c
>> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>       vmr->g = g;
>>       mr = g_new0(MemoryRegion, 1);
>>   -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size,
>> data);
>> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
>>       memory_region_add_subregion(&b->hostmem, offset, mr);
>>       memory_region_set_enabled(mr, true);
>>   
>
> I suggest dropping this patch for now due to the reason I pointed out
> for the first version of this series.

This fixes an actual bug - without it we get a hang. 

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Akihiko Odaki 8 months, 3 weeks ago

On 2025/05/22 15:45, Alex Bennée wrote:
> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
> 
>> On 2025/05/22 1:42, Alex Bennée wrote:
>>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>> This commit fixes an indefinite hang when using VIRTIO GPU blob
>>> objects
>>> under TCG in certain conditions.
>>> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
>>> MemoryRegion and attaches it to an offset on a PCI BAR of the
>>> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
>>> it.
>>> Because virglrenderer commands are not thread-safe they are only
>>> called on the main context and QEMU performs the cleanup in three steps
>>> to prevent a use-after-free scenario where the guest can access the
>>> region after it’s unmapped:
>>> 1. From the main context, the region’s field finish_unmapping is
>>> false
>>>      by default, so it sets a variable cmd_suspended, increases the
>>>      renderer_blocked variable, deletes the blob subregion, and unparents
>>>      the blob subregion causing its reference count to decrement.
>>> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>>>      recalculated, the free callback of the blob region
>>>      virtio_gpu_virgl_hostmem_region_free is called which sets the
>>>      region’s field finish_unmapping to true, allowing the main thread
>>>      context to finish replying to the command
>>> 3. From the main context, the command is processed again, but this
>>> time
>>>      finish_unmapping is true, so virgl_renderer_resource_unmap can be
>>>      called and a response is sent to the guest.
>>> It happens so that under TCG, if the guest has no timers configured
>>> (and
>>> thus no interrupt will cause the CPU to exit), the RCU thread does not
>>> have enough time to grab the locks and recalculate the FlatView.
>>> That’s not a big problem in practice since most guests will assume a
>>> response will happen later in time and go on to do different things,
>>> potentially triggering interrupts and allowing the RCU context to run.
>>> If the guest waits for the unmap command to complete though, it blocks
>>> indefinitely. Attaching to the QEMU monitor and force quitting the guest
>>> allows the cleanup to continue.
>>> There's no reason why the FlatView recalculation can't occur right
>>> away
>>> when we delete the blob subregion, however. It does not, because when we
>>> create the subregion we set the object as its own parent:
>>>       memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>>> The extra reference is what prevents freeing the memory region
>>> object in
>>> the memory transaction of deleting the subregion.
>>> This commit changes the owner object to the device, which removes
>>> the
>>> extra owner reference in the memory region and causes the MR to be
>>> freed right away in the main context.
>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>>> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
>>> Cc: qemu-stable@nongnu.org
>>> ---
>>>    hw/display/virtio-gpu-virgl.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>> diff --git a/hw/display/virtio-gpu-virgl.c
>>> b/hw/display/virtio-gpu-virgl.c
>>> index 71a7500de9..8fbe4e70cc 100644
>>> --- a/hw/display/virtio-gpu-virgl.c
>>> +++ b/hw/display/virtio-gpu-virgl.c
>>> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>>        vmr->g = g;
>>>        mr = g_new0(MemoryRegion, 1);
>>>    -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size,
>>> data);
>>> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
>>>        memory_region_add_subregion(&b->hostmem, offset, mr);
>>>        memory_region_set_enabled(mr, true);
>>>    
>>
>> I suggest dropping this patch for now due to the reason I pointed out
>> for the first version of this series.
> 
> This fixes an actual bug - without it we get a hang.
> 

I understand that but it also introduces a regression; "[PATCH v3 14/20] 
ui/gtk-gl-area: Remove extra draw call in refresh" is also a similar case.

Ideally such a bug should be fixed without regression, but I understand 
it is sometimes difficult to do that and postponing the bug resolution 
until figuring out the correct way does not make sense.

In such a case, a bug should be fixed minimizing the regression and the 
documentation of the regression should be left in the code.

In particular, this patch can cause use-after-free whether TCG is used 
or not. Instead, I suggest to avoid freeing memory regions at all on 
TCG. It will surely leak memory, but won't result in use-after-free at 
least and the other accelerators will be unaffected.

Regards,
Akihiko Odaki

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Manos Pitsidianakis 8 months, 3 weeks ago

On Thu, May 22, 2025 at 10:03 AM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>
> On 2025/05/22 15:45, Alex Bennée wrote:
> > Akihiko Odaki <akihiko.odaki@daynix.com> writes:
> >
> >> On 2025/05/22 1:42, Alex Bennée wrote:
> >>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> >>> This commit fixes an indefinite hang when using VIRTIO GPU blob
> >>> objects
> >>> under TCG in certain conditions.
> >>> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
> >>> MemoryRegion and attaches it to an offset on a PCI BAR of the
> >>> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
> >>> it.
> >>> Because virglrenderer commands are not thread-safe they are only
> >>> called on the main context and QEMU performs the cleanup in three steps
> >>> to prevent a use-after-free scenario where the guest can access the
> >>> region after it’s unmapped:
> >>> 1. From the main context, the region’s field finish_unmapping is
> >>> false
> >>>      by default, so it sets a variable cmd_suspended, increases the
> >>>      renderer_blocked variable, deletes the blob subregion, and unparents
> >>>      the blob subregion causing its reference count to decrement.
> >>> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
> >>>      recalculated, the free callback of the blob region
> >>>      virtio_gpu_virgl_hostmem_region_free is called which sets the
> >>>      region’s field finish_unmapping to true, allowing the main thread
> >>>      context to finish replying to the command
> >>> 3. From the main context, the command is processed again, but this
> >>> time
> >>>      finish_unmapping is true, so virgl_renderer_resource_unmap can be
> >>>      called and a response is sent to the guest.
> >>> It happens so that under TCG, if the guest has no timers configured
> >>> (and
> >>> thus no interrupt will cause the CPU to exit), the RCU thread does not
> >>> have enough time to grab the locks and recalculate the FlatView.
> >>> That’s not a big problem in practice since most guests will assume a
> >>> response will happen later in time and go on to do different things,
> >>> potentially triggering interrupts and allowing the RCU context to run.
> >>> If the guest waits for the unmap command to complete though, it blocks
> >>> indefinitely. Attaching to the QEMU monitor and force quitting the guest
> >>> allows the cleanup to continue.
> >>> There's no reason why the FlatView recalculation can't occur right
> >>> away
> >>> when we delete the blob subregion, however. It does not, because when we
> >>> create the subregion we set the object as its own parent:
> >>>       memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
> >>> The extra reference is what prevents freeing the memory region
> >>> object in
> >>> the memory transaction of deleting the subregion.
> >>> This commit changes the owner object to the device, which removes
> >>> the
> >>> extra owner reference in the memory region and causes the MR to be
> >>> freed right away in the main context.
> >>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> >>> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> >>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> >>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
> >>> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
> >>> Cc: qemu-stable@nongnu.org
> >>> ---
> >>>    hw/display/virtio-gpu-virgl.c | 2 +-
> >>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>> diff --git a/hw/display/virtio-gpu-virgl.c
> >>> b/hw/display/virtio-gpu-virgl.c
> >>> index 71a7500de9..8fbe4e70cc 100644
> >>> --- a/hw/display/virtio-gpu-virgl.c
> >>> +++ b/hw/display/virtio-gpu-virgl.c
> >>> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
> >>>        vmr->g = g;
> >>>        mr = g_new0(MemoryRegion, 1);
> >>>    -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size,
> >>> data);
> >>> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
> >>>        memory_region_add_subregion(&b->hostmem, offset, mr);
> >>>        memory_region_set_enabled(mr, true);
> >>>
> >>
> >> I suggest dropping this patch for now due to the reason I pointed out
> >> for the first version of this series.
> >
> > This fixes an actual bug - without it we get a hang.
> >
>
> I understand that but it also introduces a regression; "[PATCH v3 14/20]
> ui/gtk-gl-area: Remove extra draw call in refresh" is also a similar case.
>
> Ideally such a bug should be fixed without regression, but I understand
> it is sometimes difficult to do that and postponing the bug resolution
> until figuring out the correct way does not make sense.
>
> In such a case, a bug should be fixed minimizing the regression and the
> documentation of the regression should be left in the code.
>
> In particular, this patch can cause use-after-free whether TCG is used
> or not. Instead, I suggest to avoid freeing memory regions at all on
> TCG. It will surely leak memory, but won't result in use-after-free at
> least and the other accelerators will be unaffected.
>
> Regards,
> Akihiko Odaki

We tested this fix with ASAN and didn't see anything. Do you have a
test case in mind that can reproduce this use-after-free? It'd help
make a certain decision on whether to drop this patch or not. I'm not
doubting that this can cause a use-after-free by the way, it's just
that it is hypothetical only. If it causes a use-after-free for sure
we should definitely drop it.

> Instead, I suggest to avoid freeing memory regions at all on
> TCG. It will surely leak memory, but won't result in use-after-free at
> least and the other accelerators will be unaffected.

Leaking memory for blob objects is also not ideal, since they are
frequently allocated. It's memory-safe but the leak can accumulate
over time.

-- 
Manos Pitsidianakis
Emulation and Virtualization Engineer at Linaro Ltd

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Akihiko Odaki 8 months, 3 weeks ago

On 2025/05/22 16:31, Manos Pitsidianakis wrote:
> On Thu, May 22, 2025 at 10:03 AM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>
>> On 2025/05/22 15:45, Alex Bennée wrote:
>>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>>>
>>>> On 2025/05/22 1:42, Alex Bennée wrote:
>>>>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>>>> This commit fixes an indefinite hang when using VIRTIO GPU blob
>>>>> objects
>>>>> under TCG in certain conditions.
>>>>> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
>>>>> MemoryRegion and attaches it to an offset on a PCI BAR of the
>>>>> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
>>>>> it.
>>>>> Because virglrenderer commands are not thread-safe they are only
>>>>> called on the main context and QEMU performs the cleanup in three steps
>>>>> to prevent a use-after-free scenario where the guest can access the
>>>>> region after it’s unmapped:
>>>>> 1. From the main context, the region’s field finish_unmapping is
>>>>> false
>>>>>       by default, so it sets a variable cmd_suspended, increases the
>>>>>       renderer_blocked variable, deletes the blob subregion, and unparents
>>>>>       the blob subregion causing its reference count to decrement.
>>>>> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>>>>>       recalculated, the free callback of the blob region
>>>>>       virtio_gpu_virgl_hostmem_region_free is called which sets the
>>>>>       region’s field finish_unmapping to true, allowing the main thread
>>>>>       context to finish replying to the command
>>>>> 3. From the main context, the command is processed again, but this
>>>>> time
>>>>>       finish_unmapping is true, so virgl_renderer_resource_unmap can be
>>>>>       called and a response is sent to the guest.
>>>>> It happens so that under TCG, if the guest has no timers configured
>>>>> (and
>>>>> thus no interrupt will cause the CPU to exit), the RCU thread does not
>>>>> have enough time to grab the locks and recalculate the FlatView.
>>>>> That’s not a big problem in practice since most guests will assume a
>>>>> response will happen later in time and go on to do different things,
>>>>> potentially triggering interrupts and allowing the RCU context to run.
>>>>> If the guest waits for the unmap command to complete though, it blocks
>>>>> indefinitely. Attaching to the QEMU monitor and force quitting the guest
>>>>> allows the cleanup to continue.
>>>>> There's no reason why the FlatView recalculation can't occur right
>>>>> away
>>>>> when we delete the blob subregion, however. It does not, because when we
>>>>> create the subregion we set the object as its own parent:
>>>>>        memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>>>>> The extra reference is what prevents freeing the memory region
>>>>> object in
>>>>> the memory transaction of deleting the subregion.
>>>>> This commit changes the owner object to the device, which removes
>>>>> the
>>>>> extra owner reference in the memory region and causes the MR to be
>>>>> freed right away in the main context.
>>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>>>> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>>>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>>>>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>>>>> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
>>>>> Cc: qemu-stable@nongnu.org
>>>>> ---
>>>>>     hw/display/virtio-gpu-virgl.c | 2 +-
>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>> diff --git a/hw/display/virtio-gpu-virgl.c
>>>>> b/hw/display/virtio-gpu-virgl.c
>>>>> index 71a7500de9..8fbe4e70cc 100644
>>>>> --- a/hw/display/virtio-gpu-virgl.c
>>>>> +++ b/hw/display/virtio-gpu-virgl.c
>>>>> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>>>>         vmr->g = g;
>>>>>         mr = g_new0(MemoryRegion, 1);
>>>>>     -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size,
>>>>> data);
>>>>> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
>>>>>         memory_region_add_subregion(&b->hostmem, offset, mr);
>>>>>         memory_region_set_enabled(mr, true);
>>>>>
>>>>
>>>> I suggest dropping this patch for now due to the reason I pointed out
>>>> for the first version of this series.
>>>
>>> This fixes an actual bug - without it we get a hang.
>>>
>>
>> I understand that but it also introduces a regression; "[PATCH v3 14/20]
>> ui/gtk-gl-area: Remove extra draw call in refresh" is also a similar case.
>>
>> Ideally such a bug should be fixed without regression, but I understand
>> it is sometimes difficult to do that and postponing the bug resolution
>> until figuring out the correct way does not make sense.
>>
>> In such a case, a bug should be fixed minimizing the regression and the
>> documentation of the regression should be left in the code.
>>
>> In particular, this patch can cause use-after-free whether TCG is used
>> or not. Instead, I suggest to avoid freeing memory regions at all on
>> TCG. It will surely leak memory, but won't result in use-after-free at
>> least and the other accelerators will be unaffected.
>>
>> Regards,
>> Akihiko Odaki
> 
> We tested this fix with ASAN and didn't see anything. Do you have a
> test case in mind that can reproduce this use-after-free? It'd help
> make a certain decision on whether to drop this patch or not. I'm not
> doubting that this can cause a use-after-free by the way, it's just
> that it is hypothetical only. If it causes a use-after-free for sure
> we should definitely drop it.

No, I don't have a test case and it should rarely occur. More 
concretely, a UAF occurs if the guest accesses the memory region while 
trying to unmap it. It is just a theory indeed, but the theory says the 
UAF is possible.

> 
>> Instead, I suggest to avoid freeing memory regions at all on
>> TCG. It will surely leak memory, but won't result in use-after-free at
>> least and the other accelerators will be unaffected.
> 
> Leaking memory for blob objects is also not ideal, since they are
> frequently allocated. It's memory-safe but the leak can accumulate
> over time.
 >

Memory safety and leak free cannot be compatible unless RCU is fixed. We 
need to choose either of them.

Regards,
Akihiko Odaki

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Alex Bennée 8 months, 3 weeks ago

Akihiko Odaki <akihiko.odaki@daynix.com> writes:

> On 2025/05/22 16:31, Manos Pitsidianakis wrote:
>> On Thu, May 22, 2025 at 10:03 AM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>>
>>> On 2025/05/22 15:45, Alex Bennée wrote:
>>>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>>>>
>>>>> On 2025/05/22 1:42, Alex Bennée wrote:
>>>>>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>>>>> This commit fixes an indefinite hang when using VIRTIO GPU blob
>>>>>> objects
>>>>>> under TCG in certain conditions.
>>>>>> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
>>>>>> MemoryRegion and attaches it to an offset on a PCI BAR of the
>>>>>> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
>>>>>> it.
>>>>>> Because virglrenderer commands are not thread-safe they are only
>>>>>> called on the main context and QEMU performs the cleanup in three steps
>>>>>> to prevent a use-after-free scenario where the guest can access the
>>>>>> region after it’s unmapped:
>>>>>> 1. From the main context, the region’s field finish_unmapping is
>>>>>> false
>>>>>>       by default, so it sets a variable cmd_suspended, increases the
>>>>>>       renderer_blocked variable, deletes the blob subregion, and unparents
>>>>>>       the blob subregion causing its reference count to decrement.
>>>>>> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>>>>>>       recalculated, the free callback of the blob region
>>>>>>       virtio_gpu_virgl_hostmem_region_free is called which sets the
>>>>>>       region’s field finish_unmapping to true, allowing the main thread
>>>>>>       context to finish replying to the command
>>>>>> 3. From the main context, the command is processed again, but this
>>>>>> time
>>>>>>       finish_unmapping is true, so virgl_renderer_resource_unmap can be
>>>>>>       called and a response is sent to the guest.
>>>>>> It happens so that under TCG, if the guest has no timers configured
>>>>>> (and
>>>>>> thus no interrupt will cause the CPU to exit), the RCU thread does not
>>>>>> have enough time to grab the locks and recalculate the FlatView.
>>>>>> That’s not a big problem in practice since most guests will assume a
>>>>>> response will happen later in time and go on to do different things,
>>>>>> potentially triggering interrupts and allowing the RCU context to run.
>>>>>> If the guest waits for the unmap command to complete though, it blocks
>>>>>> indefinitely. Attaching to the QEMU monitor and force quitting the guest
>>>>>> allows the cleanup to continue.
>>>>>> There's no reason why the FlatView recalculation can't occur right
>>>>>> away
>>>>>> when we delete the blob subregion, however. It does not, because when we
>>>>>> create the subregion we set the object as its own parent:
>>>>>>        memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>>>>>> The extra reference is what prevents freeing the memory region
>>>>>> object in
>>>>>> the memory transaction of deleting the subregion.
>>>>>> This commit changes the owner object to the device, which removes
>>>>>> the
>>>>>> extra owner reference in the memory region and causes the MR to be
>>>>>> freed right away in the main context.
>>>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>>>>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>>>>>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>>>>>> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
>>>>>> Cc: qemu-stable@nongnu.org
>>>>>> ---
>>>>>>     hw/display/virtio-gpu-virgl.c | 2 +-
>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>> diff --git a/hw/display/virtio-gpu-virgl.c
>>>>>> b/hw/display/virtio-gpu-virgl.c
>>>>>> index 71a7500de9..8fbe4e70cc 100644
>>>>>> --- a/hw/display/virtio-gpu-virgl.c
>>>>>> +++ b/hw/display/virtio-gpu-virgl.c
>>>>>> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>>>>>         vmr->g = g;
>>>>>>         mr = g_new0(MemoryRegion, 1);
>>>>>>     -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size,
>>>>>> data);
>>>>>> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
>>>>>>         memory_region_add_subregion(&b->hostmem, offset, mr);
>>>>>>         memory_region_set_enabled(mr, true);
>>>>>>
>>>>>
>>>>> I suggest dropping this patch for now due to the reason I pointed out
>>>>> for the first version of this series.
>>>>
>>>> This fixes an actual bug - without it we get a hang.
>>>>
>>>
>>> I understand that but it also introduces a regression; "[PATCH v3 14/20]
>>> ui/gtk-gl-area: Remove extra draw call in refresh" is also a similar case.
>>>
>>> Ideally such a bug should be fixed without regression, but I understand
>>> it is sometimes difficult to do that and postponing the bug resolution
>>> until figuring out the correct way does not make sense.
>>>
>>> In such a case, a bug should be fixed minimizing the regression and the
>>> documentation of the regression should be left in the code.
>>>
>>> In particular, this patch can cause use-after-free whether TCG is used
>>> or not. Instead, I suggest to avoid freeing memory regions at all on
>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>> least and the other accelerators will be unaffected.
>>>
>>> Regards,
>>> Akihiko Odaki
>> We tested this fix with ASAN and didn't see anything. Do you have a
>> test case in mind that can reproduce this use-after-free? It'd help
>> make a certain decision on whether to drop this patch or not. I'm not
>> doubting that this can cause a use-after-free by the way, it's just
>> that it is hypothetical only. If it causes a use-after-free for sure
>> we should definitely drop it.
>
> No, I don't have a test case and it should rarely occur. More
> concretely, a UAF occurs if the guest accesses the memory region while
> trying to unmap it. It is just a theory indeed, but the theory says
> the UAF is possible.

I have a test case this fixes which I think trumps a theoretical UAF
without a test case.

Why would the guest attempt to access after triggering the free itself?
Wouldn't it be correct to fault the guest for violating its own memory
safety rules?

>>> Instead, I suggest to avoid freeing memory regions at all on
>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>> least and the other accelerators will be unaffected.
>> Leaking memory for blob objects is also not ideal, since they are
>> frequently allocated. It's memory-safe but the leak can accumulate
>> over time.
>>
>
> Memory safety and leak free cannot be compatible unless RCU is fixed.
> We need to choose either of them.

How can the guest access something that is now unmapped? The RCU should
only run after the flatview has been updated.

>
> Regards,
> Akihiko Odaki

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Akihiko Odaki 8 months, 3 weeks ago

On 2025/05/22 18:28, Alex Bennée wrote:
> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
> 
>> On 2025/05/22 16:31, Manos Pitsidianakis wrote:
>>> On Thu, May 22, 2025 at 10:03 AM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>>>
>>>> On 2025/05/22 15:45, Alex Bennée wrote:
>>>>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>>>>>
>>>>>> On 2025/05/22 1:42, Alex Bennée wrote:
>>>>>>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>>>>>> This commit fixes an indefinite hang when using VIRTIO GPU blob
>>>>>>> objects
>>>>>>> under TCG in certain conditions.
>>>>>>> The VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB VIRTIO command creates a
>>>>>>> MemoryRegion and attaches it to an offset on a PCI BAR of the
>>>>>>> VirtIOGPUdevice. The VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB command unmaps
>>>>>>> it.
>>>>>>> Because virglrenderer commands are not thread-safe they are only
>>>>>>> called on the main context and QEMU performs the cleanup in three steps
>>>>>>> to prevent a use-after-free scenario where the guest can access the
>>>>>>> region after it’s unmapped:
>>>>>>> 1. From the main context, the region’s field finish_unmapping is
>>>>>>> false
>>>>>>>        by default, so it sets a variable cmd_suspended, increases the
>>>>>>>        renderer_blocked variable, deletes the blob subregion, and unparents
>>>>>>>        the blob subregion causing its reference count to decrement.
>>>>>>> 2. From an RCU context, the MemoryView gets freed, the FlatView gets
>>>>>>>        recalculated, the free callback of the blob region
>>>>>>>        virtio_gpu_virgl_hostmem_region_free is called which sets the
>>>>>>>        region’s field finish_unmapping to true, allowing the main thread
>>>>>>>        context to finish replying to the command
>>>>>>> 3. From the main context, the command is processed again, but this
>>>>>>> time
>>>>>>>        finish_unmapping is true, so virgl_renderer_resource_unmap can be
>>>>>>>        called and a response is sent to the guest.
>>>>>>> It happens so that under TCG, if the guest has no timers configured
>>>>>>> (and
>>>>>>> thus no interrupt will cause the CPU to exit), the RCU thread does not
>>>>>>> have enough time to grab the locks and recalculate the FlatView.
>>>>>>> That’s not a big problem in practice since most guests will assume a
>>>>>>> response will happen later in time and go on to do different things,
>>>>>>> potentially triggering interrupts and allowing the RCU context to run.
>>>>>>> If the guest waits for the unmap command to complete though, it blocks
>>>>>>> indefinitely. Attaching to the QEMU monitor and force quitting the guest
>>>>>>> allows the cleanup to continue.
>>>>>>> There's no reason why the FlatView recalculation can't occur right
>>>>>>> away
>>>>>>> when we delete the blob subregion, however. It does not, because when we
>>>>>>> create the subregion we set the object as its own parent:
>>>>>>>         memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
>>>>>>> The extra reference is what prevents freeing the memory region
>>>>>>> object in
>>>>>>> the memory transaction of deleting the subregion.
>>>>>>> This commit changes the owner object to the device, which removes
>>>>>>> the
>>>>>>> extra owner reference in the memory region and causes the MR to be
>>>>>>> freed right away in the main context.
>>>>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> Signed-off-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
>>>>>>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>>>>>>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>>>>>>> Message-Id: <20250410122643.1747913-3-manos.pitsidianakis@linaro.org>
>>>>>>> Cc: qemu-stable@nongnu.org
>>>>>>> ---
>>>>>>>      hw/display/virtio-gpu-virgl.c | 2 +-
>>>>>>>      1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>> diff --git a/hw/display/virtio-gpu-virgl.c
>>>>>>> b/hw/display/virtio-gpu-virgl.c
>>>>>>> index 71a7500de9..8fbe4e70cc 100644
>>>>>>> --- a/hw/display/virtio-gpu-virgl.c
>>>>>>> +++ b/hw/display/virtio-gpu-virgl.c
>>>>>>> @@ -112,7 +112,7 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
>>>>>>>          vmr->g = g;
>>>>>>>          mr = g_new0(MemoryRegion, 1);
>>>>>>>      -    memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size,
>>>>>>> data);
>>>>>>> +    memory_region_init_ram_ptr(mr, OBJECT(g), "blob", size, data);
>>>>>>>          memory_region_add_subregion(&b->hostmem, offset, mr);
>>>>>>>          memory_region_set_enabled(mr, true);
>>>>>>>
>>>>>>
>>>>>> I suggest dropping this patch for now due to the reason I pointed out
>>>>>> for the first version of this series.
>>>>>
>>>>> This fixes an actual bug - without it we get a hang.
>>>>>
>>>>
>>>> I understand that but it also introduces a regression; "[PATCH v3 14/20]
>>>> ui/gtk-gl-area: Remove extra draw call in refresh" is also a similar case.
>>>>
>>>> Ideally such a bug should be fixed without regression, but I understand
>>>> it is sometimes difficult to do that and postponing the bug resolution
>>>> until figuring out the correct way does not make sense.
>>>>
>>>> In such a case, a bug should be fixed minimizing the regression and the
>>>> documentation of the regression should be left in the code.
>>>>
>>>> In particular, this patch can cause use-after-free whether TCG is used
>>>> or not. Instead, I suggest to avoid freeing memory regions at all on
>>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>>> least and the other accelerators will be unaffected.
>>>>
>>>> Regards,
>>>> Akihiko Odaki
>>> We tested this fix with ASAN and didn't see anything. Do you have a
>>> test case in mind that can reproduce this use-after-free? It'd help
>>> make a certain decision on whether to drop this patch or not. I'm not
>>> doubting that this can cause a use-after-free by the way, it's just
>>> that it is hypothetical only. If it causes a use-after-free for sure
>>> we should definitely drop it.
>>
>> No, I don't have a test case and it should rarely occur. More
>> concretely, a UAF occurs if the guest accesses the memory region while
>> trying to unmap it. It is just a theory indeed, but the theory says
>> the UAF is possible.
> 
> I have a test case this fixes which I think trumps a theoretical UAF
> without a test case.
> 
> Why would the guest attempt to access after triggering the free itself?
> Wouldn't it be correct to fault the guest for violating its own memory
> safety rules?

docs/devel/secure-coding-practices.rst says "Unexpected accesses must 
not cause memory corruption or leaks in QEMU".

I'm not completely sure whether it is safe without concurrent accesses 
either. In particular, KVM does not immediately update the guest memory 
mapping, so it may result in a time window where the guest memory is 
mapped to an unmapped host memory region, and I suspect that could cause 
a problem. That also motivates limiting the scope of the change to TCG.

> 
>>>> Instead, I suggest to avoid freeing memory regions at all on
>>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>>> least and the other accelerators will be unaffected.
>>> Leaking memory for blob objects is also not ideal, since they are
>>> frequently allocated. It's memory-safe but the leak can accumulate
>>> over time.
>>>
>>
>> Memory safety and leak free cannot be compatible unless RCU is fixed.
>> We need to choose either of them.
> 
> How can the guest access something that is now unmapped? The RCU should
> only run after the flatview has been updated.

This patch bypasses RCU. That's why it solves the hang even though the 
RCU itself is not fixed.

Let me summarize the theory and the actual behavior below:

The theory is that RCU satisfies the common requirement of concurrent 
algorithms. More concretely:
1) It is race-free; for RCU, it means it prevents use-after-free.
2) It does not prevent forward progress.

The patch message suggests 2) is not satisfied. A proper fix would be to 
change RCU to satisfy 2).

However, this patch workarounds the problem by circumventing RCU, which 
solves 2) but it regresses 1).

My suggestion is to document and to limit the impact of 1) by:
a) Limiting the scope of the change to TCG.
b) Not freeing memory regions, which will prevent use-after-free while 
leaking memory.

Manos said b) can be problematic because mappings are frequently 
created. Whether b) makes sense or not depends on the probability and 
impact of UAF and memory leak.

Regards,
Akihiko Odaki

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Alex Bennée 8 months, 2 weeks ago

Akihiko Odaki <akihiko.odaki@daynix.com> writes:

> On 2025/05/22 18:28, Alex Bennée wrote:
>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>> 
>>> On 2025/05/22 16:31, Manos Pitsidianakis wrote:
>>>> On Thu, May 22, 2025 at 10:03 AM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>>>>
>>>>> On 2025/05/22 15:45, Alex Bennée wrote:
>>>>>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>>>>>>
>>>>>>> On 2025/05/22 1:42, Alex Bennée wrote:
>>>>>>>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
<snip>
>>>>> In such a case, a bug should be fixed minimizing the regression and the
>>>>> documentation of the regression should be left in the code.
>>>>>
>>>>> In particular, this patch can cause use-after-free whether TCG is used
>>>>> or not. Instead, I suggest to avoid freeing memory regions at all on
>>>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>>>> least and the other accelerators will be unaffected.
>>>>>
>>>>> Regards,
>>>>> Akihiko Odaki
>>>> We tested this fix with ASAN and didn't see anything. Do you have a
>>>> test case in mind that can reproduce this use-after-free? It'd help
>>>> make a certain decision on whether to drop this patch or not. I'm not
>>>> doubting that this can cause a use-after-free by the way, it's just
>>>> that it is hypothetical only. If it causes a use-after-free for sure
>>>> we should definitely drop it.
>>>
>>> No, I don't have a test case and it should rarely occur. More
>>> concretely, a UAF occurs if the guest accesses the memory region while
>>> trying to unmap it. It is just a theory indeed, but the theory says
>>> the UAF is possible.
>> I have a test case this fixes which I think trumps a theoretical UAF
>> without a test case.
>> Why would the guest attempt to access after triggering the free
>> itself?
>> Wouldn't it be correct to fault the guest for violating its own memory
>> safety rules?
>
> docs/devel/secure-coding-practices.rst says "Unexpected accesses must
> not cause memory corruption or leaks in QEMU".

Agreed.

> I'm not completely sure whether it is safe without concurrent accesses
> either. In particular, KVM does not immediately update the guest
> memory mapping, so it may result in a time window where the guest
> memory is mapped to an unmapped host memory region, and I suspect that
> could cause a problem. That also motivates limiting the scope of the
> change to TCG.

Surely it does:

        memory_region_set_enabled(mr, false);
        memory_region_del_subregion(&b->hostmem, mr);

will trigger a rebuilding of the flatview - so after the memory region
is deleted any guest access should trigger a fault to the guest. Only
then do we unparent the memory region and finish the clean-up.

I don't think we want to have different paths for KVM and TCG here as it
will further complicate already complicated code. 

>>>>> Instead, I suggest to avoid freeing memory regions at all on
>>>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>>>> least and the other accelerators will be unaffected.
>>>> Leaking memory for blob objects is also not ideal, since they are
>>>> frequently allocated. It's memory-safe but the leak can accumulate
>>>> over time.
>>>>
>>>
>>> Memory safety and leak free cannot be compatible unless RCU is fixed.
>>> We need to choose either of them.
>> How can the guest access something that is now unmapped? The RCU
>> should
>> only run after the flatview has been updated.
>
> This patch bypasses RCU. That's why it solves the hang even though the
> RCU itself is not fixed.
>
> Let me summarize the theory and the actual behavior below:
>
> The theory is that RCU satisfies the common requirement of concurrent
> algorithms. More concretely:
> 1) It is race-free; for RCU, it means it prevents use-after-free.
> 2) It does not prevent forward progress.
>
> The patch message suggests 2) is not satisfied. A proper fix would be
> to change RCU to satisfy 2).

Its mutually incompatible with virglrenderer - we have to block all
commands until the virgl resource is freed and we can't do that until
the memory region is unplugged.

So yes we do bypass RCU for this but by explicitly un-parenting the
resource once the mapping has been removed.

> However, this patch workarounds the problem by circumventing RCU,
> which solves 2) but it regresses 1).

I'm still not seeing how this happens and without a test case to
demonstrate it happening I can't hold this patch in limbo forever.

> My suggestion is to document and to limit the impact of 1) by:
> a) Limiting the scope of the change to TCG.
> b) Not freeing memory regions, which will prevent use-after-free while
> leaking memory.
>
> Manos said b) can be problematic because mappings are frequently
> created. Whether b) makes sense or not depends on the probability and
> impact of UAF and memory leak

Not freeing memory regions would lead to a DoS attack instead. I don't
think we can just accumulate region like that.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v3 12/20] virtio-gpu: fix hang under TCG when unmapping blob

Posted by Akihiko Odaki 8 months, 2 weeks ago

On 2025/05/27 19:05, Alex Bennée wrote:
> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
> 
>> On 2025/05/22 18:28, Alex Bennée wrote:
>>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>>>
>>>> On 2025/05/22 16:31, Manos Pitsidianakis wrote:
>>>>> On Thu, May 22, 2025 at 10:03 AM Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>>>>>
>>>>>> On 2025/05/22 15:45, Alex Bennée wrote:
>>>>>>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>>>>>>>
>>>>>>>> On 2025/05/22 1:42, Alex Bennée wrote:
>>>>>>>>> From: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
> <snip>
>>>>>> In such a case, a bug should be fixed minimizing the regression and the
>>>>>> documentation of the regression should be left in the code.
>>>>>>
>>>>>> In particular, this patch can cause use-after-free whether TCG is used
>>>>>> or not. Instead, I suggest to avoid freeing memory regions at all on
>>>>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>>>>> least and the other accelerators will be unaffected.
>>>>>>
>>>>>> Regards,
>>>>>> Akihiko Odaki
>>>>> We tested this fix with ASAN and didn't see anything. Do you have a
>>>>> test case in mind that can reproduce this use-after-free? It'd help
>>>>> make a certain decision on whether to drop this patch or not. I'm not
>>>>> doubting that this can cause a use-after-free by the way, it's just
>>>>> that it is hypothetical only. If it causes a use-after-free for sure
>>>>> we should definitely drop it.
>>>>
>>>> No, I don't have a test case and it should rarely occur. More
>>>> concretely, a UAF occurs if the guest accesses the memory region while
>>>> trying to unmap it. It is just a theory indeed, but the theory says
>>>> the UAF is possible.
>>> I have a test case this fixes which I think trumps a theoretical UAF
>>> without a test case.
>>> Why would the guest attempt to access after triggering the free
>>> itself?
>>> Wouldn't it be correct to fault the guest for violating its own memory
>>> safety rules?
>>
>> docs/devel/secure-coding-practices.rst says "Unexpected accesses must
>> not cause memory corruption or leaks in QEMU".
> 
> Agreed.
> 
>> I'm not completely sure whether it is safe without concurrent accesses
>> either. In particular, KVM does not immediately update the guest
>> memory mapping, so it may result in a time window where the guest
>> memory is mapped to an unmapped host memory region, and I suspect that
>> could cause a problem. That also motivates limiting the scope of the
>> change to TCG.
> 
> Surely it does:
> 
>          memory_region_set_enabled(mr, false);
>          memory_region_del_subregion(&b->hostmem, mr);
> 
> will trigger a rebuilding of the flatview - so after the memory region
> is deleted any guest access should trigger a fault to the guest. Only
> then do we unparent the memory region and finish the clean-up.

I wrongly assumed it waits for RCU, but I found it is not true.

It is still not true that any guest access will trigger a fault to the 
guest. QEMU's internal AddressSpace keeps the old FlatView until the RCU 
changes, and some device may still have a DMA mapping.

> 
> I don't think we want to have different paths for KVM and TCG here as it
> will further complicate already complicated code.

Complexity is not a problem here. Any concurrent code has complexity 
needed for correctness and reliability. We should add complexity if 
needed for reliablity; otherwise, we should remove it (even if it's 
already simple).

> 
>>>>>> Instead, I suggest to avoid freeing memory regions at all on
>>>>>> TCG. It will surely leak memory, but won't result in use-after-free at
>>>>>> least and the other accelerators will be unaffected.
>>>>> Leaking memory for blob objects is also not ideal, since they are
>>>>> frequently allocated. It's memory-safe but the leak can accumulate
>>>>> over time.
>>>>>
>>>>
>>>> Memory safety and leak free cannot be compatible unless RCU is fixed.
>>>> We need to choose either of them.
>>> How can the guest access something that is now unmapped? The RCU
>>> should
>>> only run after the flatview has been updated.
>>
>> This patch bypasses RCU. That's why it solves the hang even though the
>> RCU itself is not fixed.
>>
>> Let me summarize the theory and the actual behavior below:
>>
>> The theory is that RCU satisfies the common requirement of concurrent
>> algorithms. More concretely:
>> 1) It is race-free; for RCU, it means it prevents use-after-free.
>> 2) It does not prevent forward progress.
>>
>> The patch message suggests 2) is not satisfied. A proper fix would be
>> to change RCU to satisfy 2).
> 
> Its mutually incompatible with virglrenderer - we have to block all
> commands until the virgl resource is freed and we can't do that until
> the memory region is unplugged.
> 
> So yes we do bypass RCU for this but by explicitly un-parenting the
> resource once the mapping has been removed.

The problem is that RCU doesn't free the memory region when all commands 
are blocked.

If we fix RCU, RCU will free the memory region, the virgl resource will 
be freed then, and the commands will be unblocked.

> 
>> However, this patch workarounds the problem by circumventing RCU,
>> which solves 2) but it regresses 1).
> 
> I'm still not seeing how this happens and without a test case to
> demonstrate it happening I can't hold this patch in limbo forever.

Not a test case that runs on the guest, but adding the following change 
to QEMU will demonstrate the issue:

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 145a0b387956..c97cd2dfd7b3 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -182,7 +182,11 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
          /* memory region owns self res->mr object and frees it by 
itself */
          memory_region_set_enabled(mr, false);
          memory_region_del_subregion(&b->hostmem, mr);
+        memory_region_ref(OBJECT(mr));
+        int k = b->renderer_blocked;
          object_unparent(OBJECT(mr));
+        assert(k == b->renderer_blocked);
+        memory_region_unref(OBJECT(mr));
      }

      return 0;

memory_region_ref() and memory_region_unref() emulates the begin and end 
of a DMA operation on mr, respectively. In a real world scenario, 
address_space_map() and address_space_unmap() call these functions.

If the code is correct, object_unparent() should not free the memory 
region while the DMA operation is ongoing, so the renderer will be kept 
blocked. assert(k == b->renderer_blocked) checks that.

My proposal is to limit the impact of the regression as possible as we 
can and to document it, not to postpone the solution until figuring out 
the problem in RCU.

> 
>> My suggestion is to document and to limit the impact of 1) by:
>> a) Limiting the scope of the change to TCG.
>> b) Not freeing memory regions, which will prevent use-after-free while
>> leaking memory.
>>
>> Manos said b) can be problematic because mappings are frequently
>> created. Whether b) makes sense or not depends on the probability and
>> impact of UAF and memory leak
> 
> Not freeing memory regions would lead to a DoS attack instead. I don't
> think we can just accumulate region like that.

UAF also results in DoS, and it can have a bigger consequence due to 
memory corruption. Memory leak is usually safer than UAF.

The question is whether memory leak causes a practical problem; it makes 
sense to opt to UAF if the accumulation happens too quickly; we at least 
know that UAF is rare.

Regards,
Akihiko Odaki