[PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting

Laszlo Ersek posted 1 patch 7 months, 2 weeks ago
Failed in applying to current master (apply log)
hw/display/ramfb.c | 1 +
1 file changed, 1 insertion(+)
[PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Laszlo Ersek 7 months, 2 weeks ago
The fw_cfg DMA write callback in ramfb prepares a new display surface in
QEMU; this new surface is put to use ("swapped in") upon the next display
update. At that time, the old surface (if any) is released.

If the guest triggers the fw_cfg DMA write callback at least twice between
two adjacent display updates, then the second callback (and further such
callbacks) will leak the previously prepared (but not yet swapped in)
display surface.

The issue can be shown by:

(1) starting QEMU with "-trace displaysurface_free", and

(2) running the following program in the guest UEFI shell:

> #include <Library/ShellCEntryLib.h>           // ShellAppMain()
> #include <Library/UefiBootServicesTableLib.h> // gBS
> #include <Protocol/GraphicsOutput.h>          // EFI_GRAPHICS_OUTPUT_PROTOCOL
>
> INTN
> EFIAPI
> ShellAppMain (
>   IN UINTN   Argc,
>   IN CHAR16  **Argv
>   )
> {
>   EFI_STATUS                    Status;
>   VOID                          *Interface;
>   EFI_GRAPHICS_OUTPUT_PROTOCOL  *Gop;
>   UINT32                        Mode;
>
>   Status = gBS->LocateProtocol (
>                   &gEfiGraphicsOutputProtocolGuid,
>                   NULL,
>                   &Interface
>                   );
>   if (EFI_ERROR (Status)) {
>     return 1;
>   }
>
>   Gop = Interface;
>
>   Mode = 1;
>   for ( ; ;) {
>     Status = Gop->SetMode (Gop, Mode);
>     if (EFI_ERROR (Status)) {
>       break;
>     }
>
>     Mode = 1 - Mode;
>   }
>
>   return 1;
> }

The symptom is then that:

- only one trace message appears periodically,

- the time between adjacent messages keeps increasing -- implying that
  some list structure (containing the leaked resources) keeps growing,

- the "surface" pointer is ever different.

> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100

We figured this wasn't a CVE-worthy problem, as only small amounts of
memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
only allocates administrative structures), plus libvirt restricts QEMU
memory footprint anyway, thus the guest can only DoS itself.

Plug the leak, by releasing the last prepared (not yet swapped in) display
surface, if any, in the fw_cfg DMA write callback.

Regarding the "reproducer", with the fix in place, the log is flooded with
trace messages (one per fw_cfg write), *and* the trace message alternates
between just two "surface" pointer values (i.e., nothing is leaked, the
allocator flip-flops between two objects in effect).

This issue appears to date back to the introducion of ramfb (995b30179bdc,
"hw/display: add ramfb, a simple boot framebuffer living in guest ram",
2018-06-18).

Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
Cc: qemu-stable@nongnu.org
Fixes: 995b30179bdc
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 hw/display/ramfb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/display/ramfb.c b/hw/display/ramfb.c
index 79b9754a5820..c2b002d53480 100644
--- a/hw/display/ramfb.c
+++ b/hw/display/ramfb.c
@@ -97,6 +97,7 @@ static void ramfb_fw_cfg_write(void *dev, off_t offset, size_t len)
 
     s->width = width;
     s->height = height;
+    qemu_free_displaysurface(s->ds);
     s->ds = surface;
 }
 
Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Laszlo Ersek 7 months, 1 week ago
On 9/19/23 15:19, Laszlo Ersek wrote:
> The fw_cfg DMA write callback in ramfb prepares a new display surface in
> QEMU; this new surface is put to use ("swapped in") upon the next display
> update. At that time, the old surface (if any) is released.
> 
> If the guest triggers the fw_cfg DMA write callback at least twice between
> two adjacent display updates, then the second callback (and further such
> callbacks) will leak the previously prepared (but not yet swapped in)
> display surface.
> 
> The issue can be shown by:
> 
> (1) starting QEMU with "-trace displaysurface_free", and
> 
> (2) running the following program in the guest UEFI shell:
> 
>> #include <Library/ShellCEntryLib.h>           // ShellAppMain()
>> #include <Library/UefiBootServicesTableLib.h> // gBS
>> #include <Protocol/GraphicsOutput.h>          // EFI_GRAPHICS_OUTPUT_PROTOCOL
>>
>> INTN
>> EFIAPI
>> ShellAppMain (
>>   IN UINTN   Argc,
>>   IN CHAR16  **Argv
>>   )
>> {
>>   EFI_STATUS                    Status;
>>   VOID                          *Interface;
>>   EFI_GRAPHICS_OUTPUT_PROTOCOL  *Gop;
>>   UINT32                        Mode;
>>
>>   Status = gBS->LocateProtocol (
>>                   &gEfiGraphicsOutputProtocolGuid,
>>                   NULL,
>>                   &Interface
>>                   );
>>   if (EFI_ERROR (Status)) {
>>     return 1;
>>   }
>>
>>   Gop = Interface;
>>
>>   Mode = 1;
>>   for ( ; ;) {
>>     Status = Gop->SetMode (Gop, Mode);
>>     if (EFI_ERROR (Status)) {
>>       break;
>>     }
>>
>>     Mode = 1 - Mode;
>>   }
>>
>>   return 1;
>> }
> 
> The symptom is then that:
> 
> - only one trace message appears periodically,
> 
> - the time between adjacent messages keeps increasing -- implying that
>   some list structure (containing the leaked resources) keeps growing,
> 
> - the "surface" pointer is ever different.
> 
>> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
>> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
>> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
>> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
>> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
>> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
>> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
>> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
>> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
>> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
>> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
>> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
>> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
>> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
>> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
>> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
>> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
>> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
>> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
>> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
>> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
>> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
>> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
>> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100
> 
> We figured this wasn't a CVE-worthy problem, as only small amounts of
> memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
> only allocates administrative structures), plus libvirt restricts QEMU
> memory footprint anyway, thus the guest can only DoS itself.
> 
> Plug the leak, by releasing the last prepared (not yet swapped in) display
> surface, if any, in the fw_cfg DMA write callback.
> 
> Regarding the "reproducer", with the fix in place, the log is flooded with
> trace messages (one per fw_cfg write), *and* the trace message alternates
> between just two "surface" pointer values (i.e., nothing is leaked, the
> allocator flip-flops between two objects in effect).
> 
> This issue appears to date back to the introducion of ramfb (995b30179bdc,
> "hw/display: add ramfb, a simple boot framebuffer living in guest ram",
> 2018-06-18).
> 
> Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
> Cc: qemu-stable@nongnu.org
> Fixes: 995b30179bdc
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  hw/display/ramfb.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/display/ramfb.c b/hw/display/ramfb.c
> index 79b9754a5820..c2b002d53480 100644
> --- a/hw/display/ramfb.c
> +++ b/hw/display/ramfb.c
> @@ -97,6 +97,7 @@ static void ramfb_fw_cfg_write(void *dev, off_t offset, size_t len)
>  
>      s->width = width;
>      s->height = height;
> +    qemu_free_displaysurface(s->ds);
>      s->ds = surface;
>  }
>  

Ping.

Laszlo
Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Gerd Hoffmann 7 months ago
On Wed, Sep 27, 2023 at 05:45:25PM +0200, Laszlo Ersek wrote:
> On 9/19/23 15:19, Laszlo Ersek wrote:
> > The fw_cfg DMA write callback in ramfb prepares a new display surface in
> > QEMU; this new surface is put to use ("swapped in") upon the next display
> > update. At that time, the old surface (if any) is released.
> > 
> > If the guest triggers the fw_cfg DMA write callback at least twice between
> > two adjacent display updates, then the second callback (and further such
> > callbacks) will leak the previously prepared (but not yet swapped in)
> > display surface.

[ ... ]

> >      s->width = width;
> >      s->height = height;
> > +    qemu_free_displaysurface(s->ds);
> >      s->ds = surface;
> >  }
> >  
> 
> Ping.

Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>

take care,
  Gerd
Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Marc-André Lureau 7 months ago
Hi

On Wed, Sep 27, 2023 at 7:46 PM Laszlo Ersek <lersek@redhat.com> wrote:
>
> On 9/19/23 15:19, Laszlo Ersek wrote:
> > The fw_cfg DMA write callback in ramfb prepares a new display surface in
> > QEMU; this new surface is put to use ("swapped in") upon the next display
> > update. At that time, the old surface (if any) is released.
> >
> > If the guest triggers the fw_cfg DMA write callback at least twice between
> > two adjacent display updates, then the second callback (and further such
> > callbacks) will leak the previously prepared (but not yet swapped in)
> > display surface.
> >
> > The issue can be shown by:
> >
> > (1) starting QEMU with "-trace displaysurface_free", and
> >
> > (2) running the following program in the guest UEFI shell:
> >
> >> #include <Library/ShellCEntryLib.h>           // ShellAppMain()
> >> #include <Library/UefiBootServicesTableLib.h> // gBS
> >> #include <Protocol/GraphicsOutput.h>          // EFI_GRAPHICS_OUTPUT_PROTOCOL
> >>
> >> INTN
> >> EFIAPI
> >> ShellAppMain (
> >>   IN UINTN   Argc,
> >>   IN CHAR16  **Argv
> >>   )
> >> {
> >>   EFI_STATUS                    Status;
> >>   VOID                          *Interface;
> >>   EFI_GRAPHICS_OUTPUT_PROTOCOL  *Gop;
> >>   UINT32                        Mode;
> >>
> >>   Status = gBS->LocateProtocol (
> >>                   &gEfiGraphicsOutputProtocolGuid,
> >>                   NULL,
> >>                   &Interface
> >>                   );
> >>   if (EFI_ERROR (Status)) {
> >>     return 1;
> >>   }
> >>
> >>   Gop = Interface;
> >>
> >>   Mode = 1;
> >>   for ( ; ;) {
> >>     Status = Gop->SetMode (Gop, Mode);
> >>     if (EFI_ERROR (Status)) {
> >>       break;
> >>     }
> >>
> >>     Mode = 1 - Mode;
> >>   }
> >>
> >>   return 1;
> >> }
> >
> > The symptom is then that:
> >
> > - only one trace message appears periodically,
> >
> > - the time between adjacent messages keeps increasing -- implying that
> >   some list structure (containing the leaked resources) keeps growing,
> >
> > - the "surface" pointer is ever different.
> >
> >> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
> >> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
> >> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
> >> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
> >> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
> >> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
> >> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
> >> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
> >> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
> >> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
> >> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
> >> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
> >> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
> >> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
> >> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
> >> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
> >> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
> >> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
> >> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
> >> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
> >> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
> >> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
> >> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
> >> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100
> >
> > We figured this wasn't a CVE-worthy problem, as only small amounts of
> > memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
> > only allocates administrative structures), plus libvirt restricts QEMU
> > memory footprint anyway, thus the guest can only DoS itself.
> >
> > Plug the leak, by releasing the last prepared (not yet swapped in) display
> > surface, if any, in the fw_cfg DMA write callback.
> >
> > Regarding the "reproducer", with the fix in place, the log is flooded with
> > trace messages (one per fw_cfg write), *and* the trace message alternates
> > between just two "surface" pointer values (i.e., nothing is leaked, the
> > allocator flip-flops between two objects in effect).
> >
> > This issue appears to date back to the introducion of ramfb (995b30179bdc,
> > "hw/display: add ramfb, a simple boot framebuffer living in guest ram",
> > 2018-06-18).
> >
> > Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
> > Cc: qemu-stable@nongnu.org
> > Fixes: 995b30179bdc
> > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > ---
> >  hw/display/ramfb.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/hw/display/ramfb.c b/hw/display/ramfb.c
> > index 79b9754a5820..c2b002d53480 100644
> > --- a/hw/display/ramfb.c
> > +++ b/hw/display/ramfb.c
> > @@ -97,6 +97,7 @@ static void ramfb_fw_cfg_write(void *dev, off_t offset, size_t len)
> >
> >      s->width = width;
> >      s->height = height;
> > +    qemu_free_displaysurface(s->ds);
> >      s->ds = surface;
> >  }

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>

Incidentally I found the same issue:
https://patchew.org/QEMU/20230920082634.3349487-1-marcandre.lureau@redhat.com/


fwiw, my migration support patch is still unreviewed:
https://patchew.org/QEMU/20230920082651.3349712-1-marcandre.lureau@redhat.com/

-- 
Marc-André Lureau
Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Laszlo Ersek 7 months ago
On 9/29/23 13:17, Marc-André Lureau wrote:
> Hi
> 
> On Wed, Sep 27, 2023 at 7:46 PM Laszlo Ersek <lersek@redhat.com> wrote:
>>
>> On 9/19/23 15:19, Laszlo Ersek wrote:
>>> The fw_cfg DMA write callback in ramfb prepares a new display surface in
>>> QEMU; this new surface is put to use ("swapped in") upon the next display
>>> update. At that time, the old surface (if any) is released.
>>>
>>> If the guest triggers the fw_cfg DMA write callback at least twice between
>>> two adjacent display updates, then the second callback (and further such
>>> callbacks) will leak the previously prepared (but not yet swapped in)
>>> display surface.
>>>
>>> The issue can be shown by:
>>>
>>> (1) starting QEMU with "-trace displaysurface_free", and
>>>
>>> (2) running the following program in the guest UEFI shell:
>>>
>>>> #include <Library/ShellCEntryLib.h>           // ShellAppMain()
>>>> #include <Library/UefiBootServicesTableLib.h> // gBS
>>>> #include <Protocol/GraphicsOutput.h>          // EFI_GRAPHICS_OUTPUT_PROTOCOL
>>>>
>>>> INTN
>>>> EFIAPI
>>>> ShellAppMain (
>>>>   IN UINTN   Argc,
>>>>   IN CHAR16  **Argv
>>>>   )
>>>> {
>>>>   EFI_STATUS                    Status;
>>>>   VOID                          *Interface;
>>>>   EFI_GRAPHICS_OUTPUT_PROTOCOL  *Gop;
>>>>   UINT32                        Mode;
>>>>
>>>>   Status = gBS->LocateProtocol (
>>>>                   &gEfiGraphicsOutputProtocolGuid,
>>>>                   NULL,
>>>>                   &Interface
>>>>                   );
>>>>   if (EFI_ERROR (Status)) {
>>>>     return 1;
>>>>   }
>>>>
>>>>   Gop = Interface;
>>>>
>>>>   Mode = 1;
>>>>   for ( ; ;) {
>>>>     Status = Gop->SetMode (Gop, Mode);
>>>>     if (EFI_ERROR (Status)) {
>>>>       break;
>>>>     }
>>>>
>>>>     Mode = 1 - Mode;
>>>>   }
>>>>
>>>>   return 1;
>>>> }
>>>
>>> The symptom is then that:
>>>
>>> - only one trace message appears periodically,
>>>
>>> - the time between adjacent messages keeps increasing -- implying that
>>>   some list structure (containing the leaked resources) keeps growing,
>>>
>>> - the "surface" pointer is ever different.
>>>
>>>> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
>>>> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
>>>> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
>>>> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
>>>> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
>>>> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
>>>> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
>>>> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
>>>> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
>>>> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
>>>> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
>>>> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
>>>> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
>>>> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
>>>> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
>>>> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
>>>> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
>>>> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
>>>> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
>>>> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
>>>> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
>>>> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
>>>> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
>>>> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100
>>>
>>> We figured this wasn't a CVE-worthy problem, as only small amounts of
>>> memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
>>> only allocates administrative structures), plus libvirt restricts QEMU
>>> memory footprint anyway, thus the guest can only DoS itself.
>>>
>>> Plug the leak, by releasing the last prepared (not yet swapped in) display
>>> surface, if any, in the fw_cfg DMA write callback.
>>>
>>> Regarding the "reproducer", with the fix in place, the log is flooded with
>>> trace messages (one per fw_cfg write), *and* the trace message alternates
>>> between just two "surface" pointer values (i.e., nothing is leaked, the
>>> allocator flip-flops between two objects in effect).
>>>
>>> This issue appears to date back to the introducion of ramfb (995b30179bdc,
>>> "hw/display: add ramfb, a simple boot framebuffer living in guest ram",
>>> 2018-06-18).
>>>
>>> Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
>>> Cc: qemu-stable@nongnu.org
>>> Fixes: 995b30179bdc
>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>> ---
>>>  hw/display/ramfb.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/hw/display/ramfb.c b/hw/display/ramfb.c
>>> index 79b9754a5820..c2b002d53480 100644
>>> --- a/hw/display/ramfb.c
>>> +++ b/hw/display/ramfb.c
>>> @@ -97,6 +97,7 @@ static void ramfb_fw_cfg_write(void *dev, off_t offset, size_t len)
>>>
>>>      s->width = width;
>>>      s->height = height;
>>> +    qemu_free_displaysurface(s->ds);
>>>      s->ds = surface;
>>>  }
> 
> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> 
> Incidentally I found the same issue:
> https://patchew.org/QEMU/20230920082634.3349487-1-marcandre.lureau@redhat.com/

Which patch is better?

I certainly didn't think of g_clear_pointer(); is that more idiomatic?

> 
> 
> fwiw, my migration support patch is still unreviewed:
> https://patchew.org/QEMU/20230920082651.3349712-1-marcandre.lureau@redhat.com/
> 

I don't have a copy of that patch (not subscribed, sorry...), but how
simply you did it surprises me. I did expect to simulate an fw_cfg write
in post_load, but I thought we'd approach the device (for the sake of
including it in the migration stream) from the higher level device's
vmstate descriptors (dc->vmsd) that set up / depend on ramfb in the
first place. I didn't know that raw vmstate_register() was still accepted.

If it is, then, for that patch (with Gerd's comment addressed):

Acked-by: Laszlo Ersek <lersek@redhat.com>

BTW: can you please assign
<https://bugzilla.redhat.com/show_bug.cgi?id=1859424> to yourself and
link your patch into it? The reason we ended up duplicating work here is
that noone took RHBZ 1859424 before.

... Well, the ticket is RHEL-7478 in JIRA now, in fact. :/

Thanks!
Laszlo


Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Laszlo Ersek 7 months ago
On 10/1/23 00:14, Laszlo Ersek wrote:
> On 9/29/23 13:17, Marc-André Lureau wrote:
>> Hi
>>
>> On Wed, Sep 27, 2023 at 7:46 PM Laszlo Ersek <lersek@redhat.com> wrote:
>>>
>>> On 9/19/23 15:19, Laszlo Ersek wrote:
>>>> The fw_cfg DMA write callback in ramfb prepares a new display surface in
>>>> QEMU; this new surface is put to use ("swapped in") upon the next display
>>>> update. At that time, the old surface (if any) is released.
>>>>
>>>> If the guest triggers the fw_cfg DMA write callback at least twice between
>>>> two adjacent display updates, then the second callback (and further such
>>>> callbacks) will leak the previously prepared (but not yet swapped in)
>>>> display surface.
>>>>
>>>> The issue can be shown by:
>>>>
>>>> (1) starting QEMU with "-trace displaysurface_free", and
>>>>
>>>> (2) running the following program in the guest UEFI shell:
>>>>
>>>>> #include <Library/ShellCEntryLib.h>           // ShellAppMain()
>>>>> #include <Library/UefiBootServicesTableLib.h> // gBS
>>>>> #include <Protocol/GraphicsOutput.h>          // EFI_GRAPHICS_OUTPUT_PROTOCOL
>>>>>
>>>>> INTN
>>>>> EFIAPI
>>>>> ShellAppMain (
>>>>>   IN UINTN   Argc,
>>>>>   IN CHAR16  **Argv
>>>>>   )
>>>>> {
>>>>>   EFI_STATUS                    Status;
>>>>>   VOID                          *Interface;
>>>>>   EFI_GRAPHICS_OUTPUT_PROTOCOL  *Gop;
>>>>>   UINT32                        Mode;
>>>>>
>>>>>   Status = gBS->LocateProtocol (
>>>>>                   &gEfiGraphicsOutputProtocolGuid,
>>>>>                   NULL,
>>>>>                   &Interface
>>>>>                   );
>>>>>   if (EFI_ERROR (Status)) {
>>>>>     return 1;
>>>>>   }
>>>>>
>>>>>   Gop = Interface;
>>>>>
>>>>>   Mode = 1;
>>>>>   for ( ; ;) {
>>>>>     Status = Gop->SetMode (Gop, Mode);
>>>>>     if (EFI_ERROR (Status)) {
>>>>>       break;
>>>>>     }
>>>>>
>>>>>     Mode = 1 - Mode;
>>>>>   }
>>>>>
>>>>>   return 1;
>>>>> }
>>>>
>>>> The symptom is then that:
>>>>
>>>> - only one trace message appears periodically,
>>>>
>>>> - the time between adjacent messages keeps increasing -- implying that
>>>>   some list structure (containing the leaked resources) keeps growing,
>>>>
>>>> - the "surface" pointer is ever different.
>>>>
>>>>> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
>>>>> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
>>>>> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
>>>>> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
>>>>> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
>>>>> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
>>>>> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
>>>>> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
>>>>> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
>>>>> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
>>>>> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
>>>>> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
>>>>> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
>>>>> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
>>>>> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
>>>>> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
>>>>> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
>>>>> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
>>>>> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
>>>>> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
>>>>> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
>>>>> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
>>>>> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
>>>>> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100
>>>>
>>>> We figured this wasn't a CVE-worthy problem, as only small amounts of
>>>> memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
>>>> only allocates administrative structures), plus libvirt restricts QEMU
>>>> memory footprint anyway, thus the guest can only DoS itself.
>>>>
>>>> Plug the leak, by releasing the last prepared (not yet swapped in) display
>>>> surface, if any, in the fw_cfg DMA write callback.
>>>>
>>>> Regarding the "reproducer", with the fix in place, the log is flooded with
>>>> trace messages (one per fw_cfg write), *and* the trace message alternates
>>>> between just two "surface" pointer values (i.e., nothing is leaked, the
>>>> allocator flip-flops between two objects in effect).
>>>>
>>>> This issue appears to date back to the introducion of ramfb (995b30179bdc,
>>>> "hw/display: add ramfb, a simple boot framebuffer living in guest ram",
>>>> 2018-06-18).
>>>>
>>>> Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
>>>> Cc: qemu-stable@nongnu.org
>>>> Fixes: 995b30179bdc
>>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>>> ---
>>>>  hw/display/ramfb.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/hw/display/ramfb.c b/hw/display/ramfb.c
>>>> index 79b9754a5820..c2b002d53480 100644
>>>> --- a/hw/display/ramfb.c
>>>> +++ b/hw/display/ramfb.c
>>>> @@ -97,6 +97,7 @@ static void ramfb_fw_cfg_write(void *dev, off_t offset, size_t len)
>>>>
>>>>      s->width = width;
>>>>      s->height = height;
>>>> +    qemu_free_displaysurface(s->ds);
>>>>      s->ds = surface;
>>>>  }
>>
>> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>
>> Incidentally I found the same issue:
>> https://patchew.org/QEMU/20230920082634.3349487-1-marcandre.lureau@redhat.com/
> 
> Which patch is better?
> 
> I certainly didn't think of g_clear_pointer(); is that more idiomatic?

FWIW, I'm happy to use g_clear_pointer() if that's deemed more
idiomatic, but I'd appreciate if my *commit message* (with the UEFI
reproducer) could land in the git history :)

Laszlo

> 
>>
>>
>> fwiw, my migration support patch is still unreviewed:
>> https://patchew.org/QEMU/20230920082651.3349712-1-marcandre.lureau@redhat.com/
>>
> 
> I don't have a copy of that patch (not subscribed, sorry...), but how
> simply you did it surprises me. I did expect to simulate an fw_cfg write
> in post_load, but I thought we'd approach the device (for the sake of
> including it in the migration stream) from the higher level device's
> vmstate descriptors (dc->vmsd) that set up / depend on ramfb in the
> first place. I didn't know that raw vmstate_register() was still accepted.
> 
> If it is, then, for that patch (with Gerd's comment addressed):
> 
> Acked-by: Laszlo Ersek <lersek@redhat.com>
> 
> BTW: can you please assign
> <https://bugzilla.redhat.com/show_bug.cgi?id=1859424> to yourself and
> link your patch into it? The reason we ended up duplicating work here is
> that noone took RHBZ 1859424 before.
> 
> ... Well, the ticket is RHEL-7478 in JIRA now, in fact. :/
> 
> Thanks!
> Laszlo


Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Laszlo Ersek 7 months ago
On 10/1/23 00:14, Laszlo Ersek wrote:
> On 9/29/23 13:17, Marc-André Lureau wrote:
>> Hi
>>
>> On Wed, Sep 27, 2023 at 7:46 PM Laszlo Ersek <lersek@redhat.com> wrote:
>>>
>>> On 9/19/23 15:19, Laszlo Ersek wrote:
>>>> The fw_cfg DMA write callback in ramfb prepares a new display surface in
>>>> QEMU; this new surface is put to use ("swapped in") upon the next display
>>>> update. At that time, the old surface (if any) is released.
>>>>
>>>> If the guest triggers the fw_cfg DMA write callback at least twice between
>>>> two adjacent display updates, then the second callback (and further such
>>>> callbacks) will leak the previously prepared (but not yet swapped in)
>>>> display surface.
>>>>
>>>> The issue can be shown by:
>>>>
>>>> (1) starting QEMU with "-trace displaysurface_free", and
>>>>
>>>> (2) running the following program in the guest UEFI shell:
>>>>
>>>>> #include <Library/ShellCEntryLib.h>           // ShellAppMain()
>>>>> #include <Library/UefiBootServicesTableLib.h> // gBS
>>>>> #include <Protocol/GraphicsOutput.h>          // EFI_GRAPHICS_OUTPUT_PROTOCOL
>>>>>
>>>>> INTN
>>>>> EFIAPI
>>>>> ShellAppMain (
>>>>>   IN UINTN   Argc,
>>>>>   IN CHAR16  **Argv
>>>>>   )
>>>>> {
>>>>>   EFI_STATUS                    Status;
>>>>>   VOID                          *Interface;
>>>>>   EFI_GRAPHICS_OUTPUT_PROTOCOL  *Gop;
>>>>>   UINT32                        Mode;
>>>>>
>>>>>   Status = gBS->LocateProtocol (
>>>>>                   &gEfiGraphicsOutputProtocolGuid,
>>>>>                   NULL,
>>>>>                   &Interface
>>>>>                   );
>>>>>   if (EFI_ERROR (Status)) {
>>>>>     return 1;
>>>>>   }
>>>>>
>>>>>   Gop = Interface;
>>>>>
>>>>>   Mode = 1;
>>>>>   for ( ; ;) {
>>>>>     Status = Gop->SetMode (Gop, Mode);
>>>>>     if (EFI_ERROR (Status)) {
>>>>>       break;
>>>>>     }
>>>>>
>>>>>     Mode = 1 - Mode;
>>>>>   }
>>>>>
>>>>>   return 1;
>>>>> }
>>>>
>>>> The symptom is then that:
>>>>
>>>> - only one trace message appears periodically,
>>>>
>>>> - the time between adjacent messages keeps increasing -- implying that
>>>>   some list structure (containing the leaked resources) keeps growing,
>>>>
>>>> - the "surface" pointer is ever different.
>>>>
>>>>> 18566@1695127471.449586:displaysurface_free surface=0x7f2fcc09a7c0
>>>>> 18566@1695127471.529559:displaysurface_free surface=0x7f2fcc9dac10
>>>>> 18566@1695127471.659812:displaysurface_free surface=0x7f2fcc441dd0
>>>>> 18566@1695127471.839669:displaysurface_free surface=0x7f2fcc0363d0
>>>>> 18566@1695127472.069674:displaysurface_free surface=0x7f2fcc413a80
>>>>> 18566@1695127472.349580:displaysurface_free surface=0x7f2fcc09cd00
>>>>> 18566@1695127472.679783:displaysurface_free surface=0x7f2fcc1395f0
>>>>> 18566@1695127473.059848:displaysurface_free surface=0x7f2fcc1cae50
>>>>> 18566@1695127473.489724:displaysurface_free surface=0x7f2fcc42fc50
>>>>> 18566@1695127473.969791:displaysurface_free surface=0x7f2fcc45dcc0
>>>>> 18566@1695127474.499708:displaysurface_free surface=0x7f2fcc70b9d0
>>>>> 18566@1695127475.079769:displaysurface_free surface=0x7f2fcc82acc0
>>>>> 18566@1695127475.709941:displaysurface_free surface=0x7f2fcc369c00
>>>>> 18566@1695127476.389619:displaysurface_free surface=0x7f2fcc32b910
>>>>> 18566@1695127477.119772:displaysurface_free surface=0x7f2fcc0d5a20
>>>>> 18566@1695127477.899517:displaysurface_free surface=0x7f2fcc086c40
>>>>> 18566@1695127478.729962:displaysurface_free surface=0x7f2fccc72020
>>>>> 18566@1695127479.609839:displaysurface_free surface=0x7f2fcc185160
>>>>> 18566@1695127480.539688:displaysurface_free surface=0x7f2fcc23a7e0
>>>>> 18566@1695127481.519759:displaysurface_free surface=0x7f2fcc3ec870
>>>>> 18566@1695127482.549930:displaysurface_free surface=0x7f2fcc634960
>>>>> 18566@1695127483.629661:displaysurface_free surface=0x7f2fcc26b140
>>>>> 18566@1695127484.759987:displaysurface_free surface=0x7f2fcc321700
>>>>> 18566@1695127485.940289:displaysurface_free surface=0x7f2fccaad100
>>>>
>>>> We figured this wasn't a CVE-worthy problem, as only small amounts of
>>>> memory were leaked (the framebuffer itself is mapped from guest RAM, QEMU
>>>> only allocates administrative structures), plus libvirt restricts QEMU
>>>> memory footprint anyway, thus the guest can only DoS itself.
>>>>
>>>> Plug the leak, by releasing the last prepared (not yet swapped in) display
>>>> surface, if any, in the fw_cfg DMA write callback.
>>>>
>>>> Regarding the "reproducer", with the fix in place, the log is flooded with
>>>> trace messages (one per fw_cfg write), *and* the trace message alternates
>>>> between just two "surface" pointer values (i.e., nothing is leaked, the
>>>> allocator flip-flops between two objects in effect).
>>>>
>>>> This issue appears to date back to the introducion of ramfb (995b30179bdc,
>>>> "hw/display: add ramfb, a simple boot framebuffer living in guest ram",
>>>> 2018-06-18).
>>>>
>>>> Cc: Gerd Hoffmann <kraxel@redhat.com> (maintainer:ramfb)
>>>> Cc: qemu-stable@nongnu.org
>>>> Fixes: 995b30179bdc
>>>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>>>> ---
>>>>  hw/display/ramfb.c | 1 +
>>>>  1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/hw/display/ramfb.c b/hw/display/ramfb.c
>>>> index 79b9754a5820..c2b002d53480 100644
>>>> --- a/hw/display/ramfb.c
>>>> +++ b/hw/display/ramfb.c
>>>> @@ -97,6 +97,7 @@ static void ramfb_fw_cfg_write(void *dev, off_t offset, size_t len)
>>>>
>>>>      s->width = width;
>>>>      s->height = height;
>>>> +    qemu_free_displaysurface(s->ds);
>>>>      s->ds = surface;
>>>>  }
>>
>> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>>
>> Incidentally I found the same issue:
>> https://patchew.org/QEMU/20230920082634.3349487-1-marcandre.lureau@redhat.com/
> 
> Which patch is better?
> 
> I certainly didn't think of g_clear_pointer(); is that more idiomatic?
> 
>>
>>
>> fwiw, my migration support patch is still unreviewed:
>> https://patchew.org/QEMU/20230920082651.3349712-1-marcandre.lureau@redhat.com/
>>
> 
> I don't have a copy of that patch (not subscribed, sorry...), but how
> simply you did it surprises me. I did expect to simulate an fw_cfg write
> in post_load, but I thought we'd approach the device (for the sake of
> including it in the migration stream) from the higher level device's
> vmstate descriptors (dc->vmsd) that set up / depend on ramfb in the
> first place. I didn't know that raw vmstate_register() was still accepted.
> 
> If it is, then, for that patch (with Gerd's comment addressed):
> 
> Acked-by: Laszlo Ersek <lersek@redhat.com>

I think I may have found one issue with that patch.

The fields that we save into the migration stream are integer members of
the RAMFBCfg structure that lives directly in the fw_cfg file. The ramfb
device specifies those fields for the guest as big endian. This means
that when ramfb_fw_cfg_write() runs, triggered by the guest, then on big
endian hosts, be32_to_cpu() and be64_to_cpu() will be no-ops, as the
integer representation inside the fw_cfg file will match the host CPU at
once. And on little endian hosts, these functions will byte swap. In
both cases, ramfb_create_display_surface() will have to be called with
identical host-side integers. This means that *before* the be32_to_cpu()
and be64_to_cpu() calls, the host side *values* read out from the fw_cfg
file members *differ* between big endian and little endian hosts.

And the problem is that we write precisely those values into the
migration stream, via "vmstate_ramfb_cfg". The migration stream
represents integers in big endian regardless of host endianness, but the
question is the *values* that we encode in BE for the stream. And the
values (from fw_cg) will differ between little endian and big endian hosts.

Thus, I think we should just use

  VMSTATE_BUFFER(cfg, RAMFBState)

in "vmstate_ramfb", and remove "vmstate_ramfb_cfg". For migration
purposes, we should treat the fw_cfg file as an opaque blob.

Laszlo

> 
> BTW: can you please assign
> <https://bugzilla.redhat.com/show_bug.cgi?id=1859424> to yourself and
> link your patch into it? The reason we ended up duplicating work here is
> that noone took RHBZ 1859424 before.
> 
> ... Well, the ticket is RHEL-7478 in JIRA now, in fact. :/
> 
> Thanks!
> Laszlo


Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Marc-André Lureau 7 months ago
Hi Laszlo

On Sun, Oct 1, 2023 at 4:20 AM Laszlo Ersek <lersek@redhat.com> wrote:
>
> On 10/1/23 00:14, Laszlo Ersek wrote:
> > On 9/29/23 13:17, Marc-André Lureau wrote:
[..]
> >> fwiw, my migration support patch is still unreviewed:
> >> https://patchew.org/QEMU/20230920082651.3349712-1-marcandre.lureau@redhat.com/
> >>
> >
> > I don't have a copy of that patch (not subscribed, sorry...), but how
> > simply you did it surprises me. I did expect to simulate an fw_cfg write
> > in post_load, but I thought we'd approach the device (for the sake of
> > including it in the migration stream) from the higher level device's
> > vmstate descriptors (dc->vmsd) that set up / depend on ramfb in the
> > first place. I didn't know that raw vmstate_register() was still accepted.
> >
> > If it is, then, for that patch (with Gerd's comment addressed):
> >
> > Acked-by: Laszlo Ersek <lersek@redhat.com>
>
> I think I may have found one issue with that patch.
>
> The fields that we save into the migration stream are integer members of
> the RAMFBCfg structure that lives directly in the fw_cfg file. The ramfb
> device specifies those fields for the guest as big endian. This means
> that when ramfb_fw_cfg_write() runs, triggered by the guest, then on big
> endian hosts, be32_to_cpu() and be64_to_cpu() will be no-ops, as the
> integer representation inside the fw_cfg file will match the host CPU at
> once. And on little endian hosts, these functions will byte swap. In
> both cases, ramfb_create_display_surface() will have to be called with
> identical host-side integers. This means that *before* the be32_to_cpu()
> and be64_to_cpu() calls, the host side *values* read out from the fw_cfg
> file members *differ* between big endian and little endian hosts.
>
> And the problem is that we write precisely those values into the
> migration stream, via "vmstate_ramfb_cfg". The migration stream
> represents integers in big endian regardless of host endianness, but the
> question is the *values* that we encode in BE for the stream. And the
> values (from fw_cg) will differ between little endian and big endian hosts.
>
> Thus, I think we should just use
>
>   VMSTATE_BUFFER(cfg, RAMFBState)
>
> in "vmstate_ramfb", and remove "vmstate_ramfb_cfg". For migration
> purposes, we should treat the fw_cfg file as an opaque blob.

I think I see your point. Using VMSTATE_BUFFER like that doesn't work
though. It's also more error-prone if fields are added in the struct,
imho.

However, we could simply have a post-load to convert the values to BE.
I wonder if new macros couldn't be introduced too.

> >
> > BTW: can you please assign
> > <https://bugzilla.redhat.com/show_bug.cgi?id=1859424> to yourself and
> > link your patch into it? The reason we ended up duplicating work here is
> > that noone took RHBZ 1859424 before.

I thought I did that.

https://bugzilla.redhat.com/show_bug.cgi?id=1859424#c17

> >
> > ... Well, the ticket is RHEL-7478 in JIRA now, in fact. :/

:/


-- 
Marc-André Lureau
Re: [PATCH] hw/display/ramfb: plug slight guest-triggerable leak on mode setting
Posted by Laszlo Ersek 7 months ago
On 10/1/23 18:07, Marc-André Lureau wrote:
> Hi Laszlo
> 
> On Sun, Oct 1, 2023 at 4:20 AM Laszlo Ersek <lersek@redhat.com> wrote:
>>
>> On 10/1/23 00:14, Laszlo Ersek wrote:
>>> On 9/29/23 13:17, Marc-André Lureau wrote:
> [..]
>>>> fwiw, my migration support patch is still unreviewed:
>>>> https://patchew.org/QEMU/20230920082651.3349712-1-marcandre.lureau@redhat.com/
>>>>
>>>
>>> I don't have a copy of that patch (not subscribed, sorry...), but how
>>> simply you did it surprises me. I did expect to simulate an fw_cfg write
>>> in post_load, but I thought we'd approach the device (for the sake of
>>> including it in the migration stream) from the higher level device's
>>> vmstate descriptors (dc->vmsd) that set up / depend on ramfb in the
>>> first place. I didn't know that raw vmstate_register() was still accepted.
>>>
>>> If it is, then, for that patch (with Gerd's comment addressed):
>>>
>>> Acked-by: Laszlo Ersek <lersek@redhat.com>
>>
>> I think I may have found one issue with that patch.
>>
>> The fields that we save into the migration stream are integer members of
>> the RAMFBCfg structure that lives directly in the fw_cfg file. The ramfb
>> device specifies those fields for the guest as big endian. This means
>> that when ramfb_fw_cfg_write() runs, triggered by the guest, then on big
>> endian hosts, be32_to_cpu() and be64_to_cpu() will be no-ops, as the
>> integer representation inside the fw_cfg file will match the host CPU at
>> once. And on little endian hosts, these functions will byte swap. In
>> both cases, ramfb_create_display_surface() will have to be called with
>> identical host-side integers. This means that *before* the be32_to_cpu()
>> and be64_to_cpu() calls, the host side *values* read out from the fw_cfg
>> file members *differ* between big endian and little endian hosts.
>>
>> And the problem is that we write precisely those values into the
>> migration stream, via "vmstate_ramfb_cfg". The migration stream
>> represents integers in big endian regardless of host endianness, but the
>> question is the *values* that we encode in BE for the stream. And the
>> values (from fw_cg) will differ between little endian and big endian hosts.
>>
>> Thus, I think we should just use
>>
>>   VMSTATE_BUFFER(cfg, RAMFBState)
>>
>> in "vmstate_ramfb", and remove "vmstate_ramfb_cfg". For migration
>> purposes, we should treat the fw_cfg file as an opaque blob.
> 
> I think I see your point. Using VMSTATE_BUFFER like that doesn't work
> though.

Why not?

(I mean -- does it compile but misbehave, or it doesn't even compile (an
invalid use of the VMSTATE_BUFFER macro)?)

> It's also more error-prone if fields are added in the struct,
> imho.

The structure is effectively the guest-visible register block for the
device. We probably can't add any fields, and even if we do, the new
fields are going to be part of the fw_cfg blob (writeable file), so they
should be covered by VMSTATE_BUFFER just the same.

> 
> However, we could simply have a post-load to convert the values to BE.

post_load itself is not enough; if we want to go this route, then we
need pre_save too. Without a pre_save, the host endianness influences
the data serialized to the migration stream, and there's no way to know
how to recover (the source host's endianness is unknown at load time).

pre_save could work though, if it performed the same BE to host
conversions (to a separate buffer I guess!) as the fw_cfg write callback
does.

> I wonder if new macros couldn't be introduced too.
> 
>>>
>>> BTW: can you please assign
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1859424> to yourself and
>>> link your patch into it? The reason we ended up duplicating work here is
>>> that noone took RHBZ 1859424 before.
> 
> I thought I did that.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1859424#c17

Ouch, sorry. That must have happened since I last looked, and I was
foolish enough not to CC myself on the BZ early on. My mistake!

> 
>>>
>>> ... Well, the ticket is RHEL-7478 in JIRA now, in fact. :/
> 
> :/
> 
> 

Laszlo