Hi Gerd,

> Yes, it surely makes sense to go into that direction.
> The patch as-is doesn't, it breaks the guest/host interface.
> That's ok-ish for a quick proof-of-concept, but clearly not merge-able.
>
> > TODO:
> > - Use Blob resources for getting meta-data such as modifier, format, etc.
>
> That is pretty much mandatory. Without blob resources there is no concept of resources
> shared between host and guest in virtio-gpu, all data is explicitly copied with transfer
> commands.
[Kasireddy, Vivek] My understanding of virtio-gpu and the concept of resources is still
fairly limited but are blob resources really needed for non-Virgl use-cases -- other than
something like a dmabuf/scanout blob that shares the meta-data such as modifier? I
thought the main motivation for blob resources would be to avoid the explicit copy you
mentioned for Virgl workloads.

>
> Which implies quite a bit of work because we don't have blob resource support in qemu
> yet.
[Kasireddy, Vivek] I was scrubbing through old mailing list messages to understand the
motivation behind blob resources as to why they are needed and came across this:
https://gitlab.freedesktop.org/virgl/qemu/-/commits/virtio-gpu-next

Does your work above not count for anything?

>
> > - Test with Virgil rendered BOs to see if this can be used in that case..
>
> That also opens up the question how to go forward with virtio-gpu in general. The object
> hierarchy we have right now (skipping pci + vga variants for simplicity):
>
>   TYPE_VIRTIO_GPU_BASE (abstract base)
>    -> TYPE_VIRTIO_GPU (in-qemu implementation)
>    -> TYPE_VHOST_USER_GPU (vhost-user implementation)
>
> When compiled with opengl + virgl TYPE_VIRTIO_GPU has a virgl=on/off property.
> Having a single device is not ideal for modular builds, because the
> hw-display-virtio-gpu.so module has a dependency on ui-opengl.so, so that is
> needed (due to symbol references) even for the virgl=off case. Also the code is a bit of a
> #ifdef mess.
>
> I think we should split TYPE_VIRTIO_GPU into two devices. Remove
> virgl+opengl support from TYPE_VIRTIO_GPU. Add a new
> TYPE_VIRTIO_GPU_VIRGL, with either TYPE_VIRTIO_GPU or
> TYPE_VIRTIO_GPU_BASE as parent (not sure which is easier), have all opengl/virgl
> support code there.
>
> I think when using opengl it makes sense to also require virgl, so we can use the
> virglrenderer library to manage blob resources (even when the actual rendering isn't done
> with virgl). Also reduces the complexity and test matrix.
[Kasireddy, Vivek] When you say "using opengl" are you referring to the presentation of
the rendered buffer via dmabuf or pixman? If yes, I am not sure why this would need to
depend on Virgl. For our use-case(s) where we are using virtio-gpu in buffer sharing mode,
we'd still need opengl for submitting the dmabuf to UI, IIUC.

>
> Maybe it even makes sense to deprecate in-qemu virgl support and focus exclusively on
> the vhost-user implementation, so we don't have to duplicate all work for both
> implementations.
[Kasireddy, Vivek] Is the vhost-user implementation better in terms of performance,
generally?

>
> > case, how do we make sure that Weston and Qemu UI are not using the same buffer at
> > any given time?
>
> There is graphic_hw_gl_block + graphic_hw_gl_flushed for synchronization.
> Right now this is only wired up in spice, and it is rather simple (just stalls virgl rendering
> instead of providing per-buffer synchronization).
[Kasireddy, Vivek] I guess that might work for Virgl rendering but not for our use-case.
What we need is a way to tell if the previously submitted dmabuf has been consumed by the
Host compositor or not before we release/close it. Weston (wl_buffer.release event and
fences) and EGL (sync and fences) do provide a few options but I am not sure if GTK lets
us use any of those or not. Any recommendations? EGLSync objects?

On a different note, any particular reason why Qemu UI EGL implementation is limited to
Xorg and not extended to Wayland/Weston for which there is GTK glarea?

Thanks,
Vivek

>
> take care,
> Gerd
Hi,

> > That is pretty much mandatory. Without blob resources there is no concept of resources
> > shared between host and guest in virtio-gpu, all data is explicitly copied with transfer
> > commands.
> [Kasireddy, Vivek] My understanding of virtio-gpu and the concept of resources is still
> fairly limited but are blob resources really needed for non-Virgl use-cases -- other than
> something like a dmabuf/scanout blob that shares the meta-data such as modifier? I
> thought the main motivation for blob resources would be to avoid the explicit copy you
> mentioned for Virgl workloads.

Well, you want to avoid the copy as well, right?

With blob resources you can do that in a well defined way, i.e. the guest knows what you
are doing and behaves accordingly. Without blob resources you can't, at least not without
violating the guest's expectation that any changes it makes are only visible to the host
after an explicit transfer (aka copy) command.

> > Which implies quite a bit of work because we don't have blob resource support in qemu
> > yet.
> [Kasireddy, Vivek] I was scrubbing through old mailing list messages to understand the
> motivation behind blob resources as to why they are needed and came across this:
> https://gitlab.freedesktop.org/virgl/qemu/-/commits/virtio-gpu-next
>
> Does your work above not count for anything?

It is quite old, and I think not up-to-date with the final revision of the blob resource
specification. I wouldn't be able to update this in near future due to being busy with
other projects. Feel free to grab & update & submit these patches though.

> > I think when using opengl it makes sense to also require virgl, so we can use the
> > virglrenderer library to manage blob resources (even when the actual rendering isn't done
> > with virgl). Also reduces the complexity and test matrix.
> [Kasireddy, Vivek] When you say "using opengl" are you referring to the presentation of
> the rendered buffer via dmabuf or pixman? If yes, I am not sure why this would need to
> depend on Virgl.

Well, you can probably do it without virgl as well. But why? Why effectively duplicate
the blob resource management bits in qemu instead of just using the virglrenderer library?

Besides the code duplication this is also a maintenance issue. This adds one more
configuration to virtio-gpu. Right now you can build virtio-gpu with virgl (depends on
opengl), or you can build without virgl (doesn't use opengl then). I don't think it is a
good idea to add a third mode, without virgl support but using opengl for blob dma-bufs.

> For our use-case(s) where we are using virtio-gpu in buffer sharing mode,
> we'd still need opengl for submitting the dmabuf to UI, IIUC.

Correct. When you want to use dma-bufs you need opengl.

> > Maybe it even makes sense to deprecate in-qemu virgl support and focus exclusively on
> > the vhost-user implementation, so we don't have to duplicate all work for both
> > implementations.
> [Kasireddy, Vivek] Is the vhost-user implementation better in terms of performance, generally?

It is better both in terms of security (it's easier to sandbox) and performance.

The in-qemu implementation runs in the qemu iothread, which also handles a bunch of other
jobs. Also virglrenderer being busy -- for example with compiling complex shaders -- can
block qemu for a while, which in turn can cause latency spikes in the guest. With the
vhost-user implementation this is not a problem.

Drawback is the extra communication (and synchronization) needed between vhost-user + qemu
to make the guest display available via spice or gtk. The latter can possibly be solved by
exporting the guest display as pipewire remote desktop (random idea I didn't investigate
much yet).

> On a different note, any particular reason why Qemu UI EGL
> implementation is limited to Xorg and not extended to Wayland/Weston
> for which there is GTK glarea?

Well, ideally I'd love to just use glarea. Which happens on wayland.

The problem with Xorg is that the gtk x11 backend uses glx not egl to create an opengl
context for glarea. At least that used to be the case in the past, maybe that has changed
with newer versions. qemu needs egl contexts though, otherwise dma-bufs don't work. So we
are stuck with our own egl widget implementation for now. Probably we will be able to drop
it at some point in the future.

HTH,
  Gerd
Hi Gerd,
Sorry for the delayed response. I wanted to wait until I finished my proof-of-concept --
that included adding synchronization -- to ask follow up questions.
> >
> > Does your work above not count for anything?
>
> It is quite old, and I think not up-to-date with the final revision of the blob resource
> specification. I wouldn't be able to update this in near future due to being busy with other
> projects. Feel free to grab & update & submit these patches though.
[Kasireddy, Vivek] Sure, we'll take a look at your work and use that as a starting
point. Roughly, how much of your work can be reused?
Also, given my limited understanding of how discrete GPUs work, I was wondering how
many copies would there need to be with blob resources/dmabufs and whether a zero-copy
goal would be feasible or not?
>
> Besides the code duplication this is also a maintenance issue. This adds one more
> configuration to virtio-gpu. Right now you can build virtio-gpu with virgl (depends on
> opengl), or you can build without virgl (doesn't use opengl then). I don't think it is a good
> idea to add a third mode, without virgl support but using opengl for blob dma-bufs.
[Kasireddy, Vivek] We'll have to re-visit this part but for our use-case with virtio-gpu, we
are disabling virglrenderer in Qemu and virgl DRI driver in the Guest. However, we still
need to use Opengl/EGL to convert the dmabuf (guest fb) to texture and render as part of
the UI/GTK updates.
>
>
> > On a different note, any particular reason why Qemu UI EGL
> > implementation is limited to Xorg and not extended to Wayland/Weston
> > for which there is GTK glarea?
>
> Well, ideally I'd love to just use glarea. Which happens on wayland.
>
> The problem with Xorg is that the gtk x11 backend uses glx not egl to create an opengl
> context for glarea. At least that used to be the case in the past, maybe that has changed
> with newer versions. qemu needs egl contexts though, otherwise dma-bufs don't work. So
> we are stuck with our own egl widget implementation for now. Probably we will be able
> to drop it at some point in the future.
[Kasireddy, Vivek] GTK X11 backend still uses GLX and it seems like that is not going to
change anytime soon. Having said that, I was wondering if it makes sense to add a new
purely Wayland backend besides GtkGlArea so that Qemu UI can more quickly adopt new
features such as explicit sync. I was thinking about the new backend being similar to this:
https://cgit.freedesktop.org/wayland/weston/tree/clients/simple-dmabuf-egl.c
The reason why I am proposing this idea is because even if we manage to add explicit
sync support to GTK and it gets merged, upgrading Qemu GTK support from 3.22
to > 4.x may prove to be daunting. Currently, the way I am doing explicit sync is
by adding these new APIs to GTK and calling them from Qemu:
static int
create_egl_fence_fd(EGLDisplay dpy)
{
    EGLSyncKHR sync = eglCreateSyncKHR(dpy,
                                       EGL_SYNC_NATIVE_FENCE_ANDROID,
                                       NULL);
    int fd;

    g_assert(sync != EGL_NO_SYNC_KHR);
    fd = eglDupNativeFenceFDANDROID(dpy, sync);
    g_assert(fd >= 0);
    eglDestroySyncKHR(dpy, sync);
    return fd;
}

static void
wait_for_buffer_release_fence(EGLDisplay dpy)
{
    int ret;
    EGLint attrib_list[] = {
        EGL_SYNC_NATIVE_FENCE_FD_ANDROID, release_fence_fd,
        EGL_NONE,
    };

    if (release_fence_fd < 0)
        return;

    EGLSyncKHR sync = eglCreateSyncKHR(dpy,
                                       EGL_SYNC_NATIVE_FENCE_ANDROID,
                                       attrib_list);
    g_assert(sync);
    release_fence_fd = -1;
    eglClientWaitSyncKHR(dpy, sync, 0,
                         EGL_FOREVER_KHR);
    eglDestroySyncKHR(dpy, sync);
}
And, of course, I am tying the wait above to a dma_fence associated with the
previous guest FB that is signalled to ensure that the Host is done using the FB
thereby providing explicit synchronization between Guest and Host. It seems to
work OK but I was wondering if you had any alternative ideas or suggestions
for doing explicit or implicit sync that are easier.
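
For comparison, one possible alternative to blocking in eglClientWaitSyncKHR() is to
wait on the native fence fd directly with poll(). This is only a sketch, assuming the fd
is a Linux sync_file, which becomes readable once its fence has signalled.
wait_fence_fd() and consume_fence_fd() are hypothetical helper names, not GTK or Qemu
APIs. Unlike the EGL path, where eglCreateSyncKHR() takes ownership of the fd passed in
the attribute list, the caller has to close the fd here.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fence_fd becomes readable.
 * Returns 1 if signalled, 0 on timeout, -1 on error.
 * timeout_ms follows poll() semantics (0 = just check, -1 = wait forever).
 */
static int
wait_fence_fd(int fence_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };
    int ret;

    /* Retry if interrupted by a signal; note this restarts the full timeout. */
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && (errno == EINTR || errno == EAGAIN));

    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;
    if (pfd.revents & (POLLERR | POLLNVAL))
        return -1;
    return 1;
}

/*
 * Hypothetical helper: wait on *fence_fd and close it once it is no longer
 * needed (signalled or failed). On timeout the fd is left open so the caller
 * can try again later.
 */
static int
consume_fence_fd(int *fence_fd, int timeout_ms)
{
    int ret;

    if (*fence_fd < 0)
        return 1;               /* nothing pending, treat as released */

    ret = wait_fence_fd(*fence_fd, timeout_ms);
    if (ret != 0) {
        close(*fence_fd);
        *fence_fd = -1;
    }
    return ret;
}
```

With a timeout of 0 the same helper can be used to check from a UI callback whether the
previous buffer has been released, without stalling the main loop.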
Lastly, on a different note, I noticed that there is a virtio-gpu Windows driver here:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/viogpu
We are going to try it out but do you know how up to date it is kept?
Thanks,
Vivek
On Wed, Mar 17, 2021 at 08:28:33AM +0000, Kasireddy, Vivek wrote:
> Hi Gerd,
> Sorry for the delayed response. I wanted to wait until I finished my proof-of-concept --
> that included adding synchronization -- to ask follow up questions.
>
> > >
> > > Does your work above not count for anything?
> >
> > It is quite old, and I think not up-to-date with the final revision of the blob resource
> > specification. I wouldn't be able to update this in near future due to being busy with other
> > projects. Feel free to grab & update & submit these patches though.
> [Kasireddy, Vivek] Sure, we'll take a look at your work and use that as a starting
> point. Roughly, how much of your work can be reused?

There are some small udmabuf support patches which can probably be reused pretty much
as-is. Everything else needs larger changes I suspect, but it's been a while since I
looked at this ...

> Also, given my limited understanding of how discrete GPUs work, I was wondering how
> many copies would there need to be with blob resources/dmabufs and whether a zero-copy
> goal would be feasible or not?

Good question.

Right now there are two copies (gtk ui):

  (1) guest ram -> DisplaySurface -> gtk widget (gl=off), or
  (2) guest ram -> DisplaySurface -> texture (gl=on).

You should be able to reduce this to one copy for gl=on ...

  (3) guest ram -> texture

... by taking DisplaySurface out of the picture, without any changes to the guest/host
interface. Drawback is that it requires adding an opengl dependency to virtio-gpu even
with virgl=off, because the virtio-gpu device will have to handle the copy to the texture
then, in response to guest TRANSFER commands.

When adding blob resource support:

Easiest is probably supporting VIRTIO_GPU_BLOB_MEM_GUEST (largely identical to non-blob
resources) with VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE (allows the host to create a shared
mapping). Then you can go create a udmabuf for the resource on the host side. For the
non-gl code path you can mmap() the udmabuf (which gives you a linear mapping for the
scattered guest pages) and create a DisplaySurface backed by guest ram pages (removing the
guest ram -> DisplaySurface copy). For the gl code path you can create a texture backed by
the udmabuf and go render on the host without copying at all.

Using VIRTIO_GPU_BLOB_MEM_GUEST + VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE for resources needs
guest changes too, either in mesa (when using virgl) or the kernel driver's dumb buffer
handling (when not using virgl).

Alternatively (listed more for completeness):

You can create a blob resource with VIRTGPU_BLOB_MEM_HOST3D (requires virgl, see also
virgl_drm_winsys_resource_create_blob in mesa). It will be allocated by the host, then
mapped into the guest using a virtual pci memory bar. Guest userspace (aka mesa driver)
can mmap() these resources and has direct, zero-copy access to the host resource.

Going to dma-buf export that, import into i915, then let the gpu render implies we are
doing p2p dma from a physical (pci-assigned) device to the memory bar of a virtual pci
device.

Doing that should be possible, but frankly I would be surprised if that actually works
out-of-the-box. Dunno how many dragons are lurking here. Could become an interesting
challenge to make that fly.

> > Beside the code duplication this is also a maintainance issue. This adds one more
> > configuration to virtio-gpu. Right now you can build virtio-gpu with virgl (depends on
> > opengl), or you can build without virgl (doesn't use opengl then). I don't think it is a good
> > idea to add a third mode, without virgl support but using opengl for blob dma-bufs.
> [Kasireddy, Vivek] We'll have to re-visit this part but for our use-case with virtio-gpu, we
> are disabling virglrenderer in Qemu and virgl DRI driver in the Guest. However, we still
> need to use Opengl/EGL to convert the dmabuf (guest fb) to texture and render as part of
> the UI/GTK updates.

Well, VIRTGPU_BLOB_MEM_HOST3D blob resources are created using virgl renderer commands
(VIRGL_CCMD_PIPE_RESOURCE_CREATE). So supporting that without virglrenderer is not an
option.

VIRTIO_GPU_BLOB_MEM_GUEST might be possible without too much effort.

> > > On a different note, any particular reason why Qemu UI EGL
> > > implementation is limited to Xorg and not extended to Wayland/Weston
> > > for which there is GTK glarea?
> >
> > Well, ideally I'd love to just use glarea. Which happens on wayland.
> >
> > The problem with Xorg is that the gtk x11 backend uses glx not egl to create an opengl
> > context for glarea. At least that used to be the case in the past, maybe that has changed
> > with newer versions. qemu needs egl contexts though, otherwise dma-bufs don't work. So
> > we are stuck with our own egl widget implementation for now. Probably we will be able
> > to drop it at some point in the future.
> [Kasireddy, Vivek] GTK X11 backend still uses GLX and it seems like that is not going to
> change anytime soon.

Hmm, so the egl backend has to stay for the time being.

> Having said that, I was wondering if it makes sense to add a new
> purely Wayland backend besides GtkGlArea so that Qemu UI can more quickly adopt new
> features such as explicit sync. I was thinking about the new backend being similar to this:
> https://cgit.freedesktop.org/wayland/weston/tree/clients/simple-dmabuf-egl.c

I'd prefer to not do that.

> The reason why I am proposing this idea is because even if we manage to add explicit
> sync support to GTK and it gets merged, upgrading Qemu GTK support from 3.22
> to > 4.x may prove to be daunting. Currently, the way I am doing explicit sync is
> by adding these new APIs to GTK and calling them from Qemu:

Well, we had the same code supporting gtk2+3 with #ifdefs. There are also #ifdefs to avoid
using functions deprecated during 3.x lifetime. So I expect porting to gtk4 wouldn't be
too bad.

Also I expect qemu wouldn't be the only application needing sync support, so trying to get
that integrated with upstream gtk certainly makes sense.

> Lastly, on a different note, I noticed that there is a virtio-gpu Windows driver here:
> https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/viogpu
>
> We are going to try it out but do you know how up to date it is kept?

No, not following development closely.

take care,
  Gerd
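
To make the "easiest" blob combination from the message above concrete, here is a minimal
sketch of how a host-side device model might decide whether a blob resource is a candidate
for udmabuf backing. The numeric values mirror VIRTIO_GPU_BLOB_MEM_* and
VIRTIO_GPU_BLOB_FLAG_* as defined in the virtio-gpu specification and the Linux uapi
header; they are redefined locally under an EX_ prefix so the example stands alone, and
should be checked against the real header before use. blob_can_back_udmabuf() is a
hypothetical helper, not existing Qemu code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the blob_mem / blob_flags values (assumed to match the spec). */
enum {
    EX_BLOB_MEM_GUEST          = 0x0001,
    EX_BLOB_MEM_HOST3D         = 0x0002,

    EX_BLOB_FLAG_USE_MAPPABLE  = 0x0001,
    EX_BLOB_FLAG_USE_SHAREABLE = 0x0002,
};

/*
 * Hypothetical check: only guest-memory blobs that the guest has marked
 * shareable may be exported by the host as a shared mapping (e.g. a udmabuf).
 * HOST3D blobs are allocated by the host via virglrenderer and take a
 * different path.
 */
static bool
blob_can_back_udmabuf(uint32_t blob_mem, uint32_t blob_flags)
{
    return blob_mem == EX_BLOB_MEM_GUEST &&
           (blob_flags & EX_BLOB_FLAG_USE_SHAREABLE) != 0;
}
```

Resources that fail this check would stay on the existing copy path driven by guest
TRANSFER commands.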
Hi Gerd,

Thank you for taking the time to explain how support for blob resources needs to be added.
We are going to get started soon and here are the tasks we are planning to do in order of
priority:

1) Add support for VIRTIO_GPU_BLOB_MEM_GUEST + VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE
2) Upgrade Qemu GTK UI from 3.22 to 4.x
3) Add explicit sync support to GTK4 and Qemu UI
4) Add support for VIRTGPU_BLOB_MEM_HOST3D

We'll start sending patches as we go along.

Thanks,
Vivek

> > [Kasireddy, Vivek] Sure, we'll take a look at your work and use that
> > as a starting point. Roughly, how much of your work can be reused?
>
> There are some small udmabuf support patches which can probably be reused pretty much
> as-is. Everything else needs larger changes I suspect, but it's been a while I looked at this
> ...
>
> > Also, given my limited understanding of how discrete GPUs work, I was
> > wondering how many copies would there need to be with blob
> > resources/dmabufs and whether a zero-copy goal would be feasible or not?
>
> Good question.
>
> Right now there are two copies (gtk ui):
>
>   (1) guest ram -> DisplaySurface -> gtk widget (gl=off), or
>   (2) guest ram -> DisplaySurface -> texture (gl=on).
>
> You should be able to reduce this to one copy for gl=on ...
>
>   (3) guest ram -> texture
>
> ... by taking DisplaySurface out of the picture, without any changes to the guest/host
> interface. Drawback is that it requires adding an opengl dependency to virtio-gpu even
> with virgl=off, because the virtio-gpu device will have to handle the copy to the texture
> then, in response to guest TRANSFER commands.
>
> When adding blob resource support:
>
> Easiest is probably supporting VIRTIO_GPU_BLOB_MEM_GUEST (largely identical to
> non-blob resources) with VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE
> (allows the host to create a shared mapping). Then you can go create a udmabuf for the
> resource on the host side. For the non-gl code path you can mmap() the udmabuf (which
> gives you a linear mapping for the scattered guest pages) and create a DisplaySurface
> backed by guest ram pages (removing the guest ram -> DisplaySurface copy). For the gl
> code path you can create a texture backed by the udmabuf and go render on the host
> without copying at all.
>
> Using VIRTIO_GPU_BLOB_MEM_GUEST +
> VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE for resources needs guest changes too,
> either in mesa (when using virgl) or the kernel driver's dumb buffer handling (when not
> using virgl).
>
> Alternatively (listed more for completeness):
>
> You can create a blob resource with VIRTGPU_BLOB_MEM_HOST3D (requires virgl,
> see also virgl_drm_winsys_resource_create_blob in mesa). It will be allocated by the
> host, then mapped into the guest using a virtual pci memory bar. Guest userspace (aka
> mesa driver) can mmap() these resources and has direct, zero-copy access to the host
> resource.
>
> Going to dma-buf export that, import into i915, then let the gpu render implies we are
> doing p2p dma from a physical (pci-assigned) device to the memory bar of a virtual pci
> device.
>
> Doing that should be possible, but frankly I would be surprised if that actually works
> out-of-the-box. Dunno how many dragons are lurking here.
> Could become an interesting challenge to make that fly.
>
> > > Beside the code duplication this is also a maintainance issue. This
> > > adds one more configuration to virtio-gpu. Right now you can build
> > > virtio-gpu with virgl (depends on opengl), or you can build without
> > > virgl (doesn't use opengl then). I don't think it is a good idea to add a third mode,
> > > without virgl support but using opengl for blob dma-bufs.
> > [Kasireddy, Vivek] We'll have to re-visit this part but for our
> > use-case with virtio-gpu, we are disabling virglrenderer in Qemu and
> > virgl DRI driver in the Guest. However, we still need to use
> > Opengl/EGL to convert the dmabuf (guest fb) to texture and render as part of the
> > UI/GTK updates.
>
> Well, VIRTGPU_BLOB_MEM_HOST3D blob resources are created using virgl renderer
> commands (VIRGL_CCMD_PIPE_RESOURCE_CREATE). So supporting that without
> virglrenderer is not an option.
>
> VIRTIO_GPU_BLOB_MEM_GUEST might be possible without too much effort.
>
> > > > On a different note, any particular reason why Qemu UI EGL
> > > > implementation is limited to Xorg and not extended to
> > > > Wayland/Weston for which there is GTK glarea?
> > >
> > > Well, ideally I'd love to just use glarea. Which happens on wayland.
> > >
> > > The problem with Xorg is that the gtk x11 backend uses glx not egl
> > > to create an opengl context for glarea. At least that used to be
> > > the case in the past, maybe that has changed with newer versions.
> > > qemu needs egl contexts though, otherwise dma-bufs don't work. So
> > > we are stuck with our own egl widget implementation for now. Probably we will be
> > > able to drop it at some point in the future.
> > [Kasireddy, Vivek] GTK X11 backend still uses GLX and it seems like
> > that is not going to change anytime soon.
>
> Hmm, so the egl backend has to stay for the time being.
>
> > Having said that, I was wondering if it makes sense to add a new
> > purely Wayland backend besides GtkGlArea so that Qemu UI can more
> > quickly adopt new features such as explicit sync. I was thinking about the new backend
> > being similar to this:
> > https://cgit.freedesktop.org/wayland/weston/tree/clients/simple-dmabuf-egl.c
>
> I'd prefer to not do that.
>
> > The reason why I am proposing this idea is because even if we manage
> > to add explicit sync support to GTK and it gets merged, upgrading Qemu
> > GTK support from 3.22 to > 4.x may prove to be daunting. Currently,
> > the way I am doing explicit sync is by adding these new APIs to GTK and calling them
> > from Qemu:
>
> Well, we had the same code supporting gtk2+3 with #ifdefs. There are also #ifdefs to
> avoid using functions deprecated during 3.x lifetime.
> So I expect porting to gtk4 wouldn't be too bad.
>
> Also I expect qemu wouldn't be the only application needing sync support, so trying to get
> that integrated with upstream gtk certainly makes sense.
>
> > Lastly, on a different note, I noticed that there is a virtio-gpu Windows driver here:
> > https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/viogpu
> >
> > We are going to try it out but do you know how up to date it is kept?
>
> No, not following development closely.
>
> take care,
>   Gerd