This patchset adds DRM native context support to VirtIO-GPU on Qemu.

Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
DRM native context [1] mediates the lower-level kernel driver UAPI, which
results in less CPU overhead and less/simpler code needed to support it.
A DRM context consists of host and guest parts that have to be implemented
for each GPU driver. On the guest side, a DRM context presents the virtual
GPU as a real/native host GPU device to GL/VK applications.

[1] https://www.youtube.com/watch?v=9sFP_yddLLQ

Today there are four DRM native context drivers existing in the wild:

 - Freedreno (Qualcomm SoC GPUs), completely upstreamed
 - AMDGPU, completely upstreamed
 - Intel (i915), merge requests are open
 - Asahi (Apple SoC GPUs), partially merged upstream

# How to try out DRM context:

1. DRM context uses host blobs and on the host requires the latest 6.13
   version of the Linux kernel, which contains the necessary KVM fixes.

2. Use the latest Mesa (both guest and host) and libvirglrenderer versions.
   Use the build flags documented in patch #10 of this series.

3. On the guest, use the latest Linux kernel v6.14-rc or newer.

Example Qemu cmdline that enables DRM context:

  qemu-system-x86_64 -device virtio-vga-gl,hostmem=4G,blob=on,drm_native_context=on \
      -machine q35,accel=kvm,memory-backend=mem1 \
      -object memory-backend-memfd,id=mem1,size=8G -m 8G

# Note about known performance problem in Qemu:

DRM contexts map host blobs extensively and these mapping operations work
slowly in Qemu. The exact reason is unknown. Mappings work fast on Crosvm.
For DRM contexts this problem is more visible than for Venus/Virgl.

Changelog:

v6: - Fixed compilation warning seen when building against an older version
      of virglrenderer, which wasn't fixed properly in v5.

    - Added t-bs from Alex Bennée.

    - Added patches to improve the virgl/venus docs by adding links to the
      Mesa documentation, as was suggested by Akihiko Odaki.

    - Updated the patch that documents guest/host requirements. Added links
      to Asahi nctx and reworked the doc structure by adding requirements to
      each context-type section instead of having one big blob of
      requirements, which was objected to by Akihiko Odaki.

v5: - Added r-bs from Akihiko Odaki.

    - Added acks from Michael Tsirkin.

    - Fixed compilation warning seen when building against an older version
      of virglrenderer, reported by Alex Bennée. Noticed that I need to keep
      the old virgl_write_fence() code around for the older virglrenderer in
      the "Support asynchronous fencing" patch, so added it back and verified
      that the old virglrenderer works properly.

    - Added a new patch from Alex Bennée that adds more virtio-gpu
      documentation, with a couple of corrections and additions to it from me.

    - Rebased patches on top of the latest staging tree.

v4: - Improved the SDL2/dmabuf patch by reusing the existing Meson X11 config
      option, better handling EGL errors and extending the comment to say
      that it's safe to enable the SDL2 EGL preference hint. As was suggested
      by Akihiko Odaki.

    - Replaced another QSLIST_FOREACH_SAFE with QSLIST_EMPTY+FIRST in the
      async-fencing patch for more consistency of the code. As was suggested
      by Akihiko Odaki.

    - Added missing braces around an if-statement, spotted by Alex Bennée.

    - Renamed the 'drm=on' option of the virtio-gpu-gl device to
      'drm_native_context=on' for more clarity, as was suggested by
      Alex Bennée. Haven't added the new context-type option that was also
      proposed by Alex, might do it with a separate patch. This context-type
      option would duplicate and deprecate existing options, but in the
      longer run it will likely be worthwhile to add.
    - Dropped the Linux headers-update patch as the headers have been
      updated in the staging tree.

v3: - Improved the EGL presence-check code on X11 systems for the SDL2 hint
      that prefers EGL over GLX by using better ifdefs and checking Xlib
      presence at build time, to avoid a build failure if libSDL2 and the
      system are configured with X11 support disabled. Also added a
      clarifying comment saying that the X11 hint doesn't affect Wayland
      systems. Suggested by Akihiko Odaki.

    - Corrected strerror(err) that used a negative error code where it should
      be positive and vice versa, caught by Akihiko Odaki. Added a clarifying
      comment for the case where we get a positive error code from
      virglrenderer that differs from other virglrenderer API functions.

    - Improved the QSLIST usage by dropping the mutex protecting the async
      fence list and using the atomic variants of the QSLIST helpers instead.
      Switched away from using the FOREACH helper to improve readability of
      the code, showing that we don't process the list in a suboptimal way.
      As was suggested by Akihiko Odaki.

    - Updated the patchset base to Venus v18.

v2: - Updated the SDL2-dmabuf patch by making use of error_report() and
      checking the presence of X11+EGL in the system before making SDL2
      prefer the EGL backend over GLX, suggested by Akihiko Odaki.

    - Improved SDL2's dmabuf-presence check that wasn't done properly in v1,
      where EGL was set up only after the first console was fully inited,
      and thus SDL's display .has_dmabuf callback didn't work for the first
      console. Now the dmabuf support status is pre-checked before the
      console is registered.

    - Updated the commit description of the patch that fixes SDL2's context
      switching logic with a more detailed explanation of the problem.
      Suggested by Akihiko Odaki.

    - Corrected a rebase typo in the async-fencing patch and switched
      async-fencing to use a singly-linked list instead of a doubly-linked
      one, as was suggested by Akihiko Odaki.

    - Replaced "=true" with "=on" in the DRM native context documentation
      example and made virtio_gpu_virgl_init() fail with an error message if
      the DRM context can't be initialized, instead of giving a warning
      message, as was suggested by Akihiko Odaki.

    - Added patchew's dependency tag to the cover letter, as was suggested by
      Akihiko Odaki.

Alex Bennée (1):
  docs/system: virtio-gpu: Document host/guest requirements

Dmitry Osipenko (8):
  ui/sdl2: Restore original context after new context creation
  virtio-gpu: Handle virgl fence creation errors
  virtio-gpu: Support asynchronous fencing
  virtio-gpu: Support DRM native context
  ui/sdl2: Don't disable scanout when display is refreshed
  ui/gtk: Don't disable scanout when display is refreshed
  docs/system: virtio-gpu: Add link to Mesa VirGL doc
  docs/system: virtio-gpu: Update Venus link

Pierre-Eric Pelloux-Prayer (1):
  ui/sdl2: Implement dpy dmabuf functions

 docs/system/devices/virtio-gpu.rst | 123 +++++++++++++++++++++-
 hw/display/virtio-gpu-gl.c         |   5 +
 hw/display/virtio-gpu-virgl.c      | 164 ++++++++++++++++++++++++++++-
 hw/display/virtio-gpu.c            |  15 +++
 include/hw/virtio/virtio-gpu.h     |  16 +++
 include/ui/sdl2.h                  |   7 ++
 meson.build                        |   6 +-
 ui/gtk-egl.c                       |   1 -
 ui/gtk-gl-area.c                   |   1 -
 ui/sdl2-gl.c                       |  68 +++++++++++-
 ui/sdl2.c                          |  42 ++++++++
 11 files changed, 437 insertions(+), 11 deletions(-)

-- 
2.47.1
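To make "mediates the lower-level kernel driver UAPI" concrete: on the guest,
a native-context Mesa driver talks to the virtio-gpu DRM node and asks it to
create a context bound to the driver-specific DRM capset, then funnels the
native driver's command submissions through virtio-gpu. Below is a minimal,
hypothetical sketch (not code from this series or from Mesa) of only the
context-initialisation step, using the virtgpu UAPI from
<libdrm/virtgpu_drm.h>; the capset id value 6 for the DRM capset is an
assumption here.

  /* Hypothetical sketch: initialise a virtio-gpu DRM native context from a
   * guest.  NOT code from this series or from Mesa -- it only illustrates
   * the CONTEXT_INIT step of the virtgpu UAPI.  The capset id is assumed. */
  #include <stdio.h>
  #include <stdint.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <libdrm/virtgpu_drm.h>

  #define ASSUMED_CAPSET_DRM 6   /* assumed capset id for DRM native context */

  int main(void)
  {
      int fd = open("/dev/dri/renderD128", O_RDWR);   /* virtio-gpu render node */
      if (fd < 0) {
          perror("open");
          return 1;
      }

      struct drm_virtgpu_context_set_param params[] = {
          { .param = VIRTGPU_CONTEXT_PARAM_CAPSET_ID, .value = ASSUMED_CAPSET_DRM },
      };
      struct drm_virtgpu_context_init init = {
          .num_params     = 1,
          .ctx_set_params = (uintptr_t)params,
      };

      /* Fails with EINVAL if the host has no matching native-context driver. */
      if (ioctl(fd, DRM_IOCTL_VIRTGPU_CONTEXT_INIT, &init)) {
          perror("DRM_IOCTL_VIRTGPU_CONTEXT_INIT");
      } else {
          printf("DRM native context initialised\n");
      }

      close(fd);
      return 0;
  }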
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>
> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
> DRM native context [1] mediates the lower-level kernel driver UAPI, which
> results in less CPU overhead and less/simpler code needed to support it.
> A DRM context consists of host and guest parts that have to be implemented
> for each GPU driver. On the guest side, a DRM context presents the virtual
> GPU as a real/native host GPU device to GL/VK applications.
>
<snip>

So first the good news. I can now get this up and running (x86/kvm guest
with Intel graphics) and as far as I can tell the native context mode is
working. With Dongwon Kim's patch the mirroring/corruption I was seeing
is gone.

I can successfully run glmark2-wayland (although see below) but vkmark
completely fails to start, reporting:

MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_FENCE_PASSING
MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_GUEST_HANDLE
MESA: error: DRM_IOCTL_VIRTGPU_GET_CAPS failed with Invalid argument
MESA: error: DRM_IOCTL_VIRTGPU_CONTEXT_INIT failed with Invalid argument, continuing without context...
MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
MESA: error: Failed to create virtgpu AddressSpaceStream
MESA: error: vulkan: Failed to get host connection
MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
MESA: error: Failed to create virtgpu AddressSpaceStream
MESA: error: vulkan: Failed to get host connection
MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
MESA: error: Failed to create virtgpu AddressSpaceStream
MESA: error: vulkan: Failed to get host connection
MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:681: VK_ERROR_DEVICE_LOST
MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
MESA: error: Failed to create virtgpu AddressSpaceStream
MESA: error: vulkan: Failed to get host connection
MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:332: VK_ERROR_DEVICE_LOST
=== Physical Device 0 ===
Vendor ID: 0x8086
Device ID: 0xA780
Device Name: Intel(R) Graphics (RPL-S)
Driver Version: 101068899
Device UUID: b39e1cf39b101489e3c6039406f78d6c

I was booting with 4G of shared memory.

Later versions of vkmark (2025.1) fail due to missing the VK_KHR_display
extension, required as of
https://github.com/vkmark/vkmark/commit/7c3189c6482cb84c3c0e69d6dabb9d80e0c0092a

> # Note about known performance problem in Qemu:
>
> DRM contexts map host blobs extensively and these mapping operations work
> slowly in Qemu. The exact reason is unknown. Mappings work fast on Crosvm.
> For DRM contexts this problem is more visible than for Venus/Virgl.

And how!

With drm_native I get a lot of stutter while running and barely 100FPS
(compared to ~8000 on pure venus). IMHO we need to figure out why there
is such a discrepancy before merging because currently it makes more
sense to use Venus.

<snip>

I'll do some more testing with my AMD/Aarch64 rig next week.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
On 2/14/25 17:33, Alex Bennée wrote:
> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>
>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>
>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>> results in less CPU overhead and less/simpler code needed to support it.
>> A DRM context consists of host and guest parts that have to be implemented
>> for each GPU driver. On the guest side, a DRM context presents the virtual
>> GPU as a real/native host GPU device to GL/VK applications.
>>
> <snip>
>
> So first the good news. I can now get this up and running (x86/kvm guest
> with Intel graphics) and as far as I can tell the native context mode is
> working. With Dongwon Kim's patch the mirroring/corruption I was seeing
> is gone.
>
> I can successfully run glmark2-wayland (although see below) but vkmark
> completely fails to start, reporting:
>
> MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_FENCE_PASSING
> MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_GUEST_HANDLE
> MESA: error: DRM_IOCTL_VIRTGPU_GET_CAPS failed with Invalid argument
> MESA: error: DRM_IOCTL_VIRTGPU_CONTEXT_INIT failed with Invalid argument, continuing without context...
> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
> MESA: error: Failed to create virtgpu AddressSpaceStream
> MESA: error: vulkan: Failed to get host connection
> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
> MESA: error: Failed to create virtgpu AddressSpaceStream
> MESA: error: vulkan: Failed to get host connection
> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
> MESA: error: Failed to create virtgpu AddressSpaceStream
> MESA: error: vulkan: Failed to get host connection
> MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:681: VK_ERROR_DEVICE_LOST
> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
> MESA: error: Failed to create virtgpu AddressSpaceStream
> MESA: error: vulkan: Failed to get host connection
> MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:332: VK_ERROR_DEVICE_LOST
> === Physical Device 0 ===
> Vendor ID: 0x8086
> Device ID: 0xA780
> Device Name: Intel(R) Graphics (RPL-S)
> Driver Version: 101068899
> Device UUID: b39e1cf39b101489e3c6039406f78d6c
>
> I was booting with 4G of shared memory.

Thanks for the testing.

I assume all these errors are generated by the failing gfxstream. Hence,
you may ignore them since you don't have gfxstream enabled.

> Later versions of vkmark (2025.1) fail due to missing the VK_KHR_display
> extension, required as of
> https://github.com/vkmark/vkmark/commit/7c3189c6482cb84c3c0e69d6dabb9d80e0c0092a

This VK_KHR_display problem is only reproducible with your rootfs that
you shared with me. It could be a problem with your build configs or a
buggy package version used by your rootfs build, more likely the former.

>> # Note about known performance problem in Qemu:
>>
>> DRM contexts map host blobs extensively and these mapping operations work
>> slowly in Qemu. The exact reason is unknown. Mappings work fast on Crosvm.
>> For DRM contexts this problem is more visible than for Venus/Virgl.
>
> And how!
>
> With drm_native I get a lot of stutter while running and barely 100FPS
> (compared to ~8000 on pure venus). IMHO we need to figure out why there
> is such a discrepancy before merging because currently it makes more
> sense to use Venus.

If you run with Xorg/Wayland directly, without a DE, then it should
work okay. This should be a problem with unmapping performance that I'm
thinking about.

That unmapping problem is partially understood. The unmapping code works
correctly, but we'll need to optimize the flatview code to perform
unmapping immediately. Meanwhile, you may apply the QEMU hack below; it
should resolve most of the stutter, please let me know if it helps.

There is also a pending Mesa intel-virtio blob mapping optimization that
currently isn't available in my gitlab code. I'll refresh that feature
and then ask you to try it.

Could be that there is more to the unmapping perf issue in QEMU. I'm
investigating.

AMDGPU nctx is less affected by the bad unmapping performance. I expect
it will work well for you.


diff --git a/util/rcu.c b/util/rcu.c
index fa32c942e4bb..aac3522c323c 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -174,7 +174,7 @@ void synchronize_rcu(void)
 }
 
 
-#define RCU_CALL_MIN_SIZE 30
+#define RCU_CALL_MIN_SIZE 1
 
 /* Multi-producer, single-consumer queue based on urcu/static/wfqueue.h
  * from liburcu.  Note that head is only used by the consumer.
@@ -267,7 +267,7 @@ static void *call_rcu_thread(void *opaque)
          * added before synchronize_rcu() starts.
          */
         while (n == 0 || (n < RCU_CALL_MIN_SIZE && ++tries <= 5)) {
-            g_usleep(10000);
+            g_usleep(1000);
             if (n == 0) {
                 qemu_event_reset(&rcu_call_ready_event);
                 n = qatomic_read(&rcu_call_count);

-- 
Best regards,
Dmitry
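For context on why this hack helps: as the hunk above shows, call_rcu_thread()
batches reclamation and, while fewer than RCU_CALL_MIN_SIZE callbacks are
queued, sleeps up to five more 10 ms intervals before processing them. A
single pending callback (such as the finalizer of an unmapped blob's
MemoryRegion) can therefore sit in the queue for roughly 50 ms. The following
standalone model (not QEMU code, just the same loop shape) illustrates that
worst case:

  /* Standalone model of the batching loop in QEMU's call_rcu_thread().
   * Not QEMU code -- it only reproduces the loop shape shown in the diff
   * above to illustrate the worst-case delay for one pending callback. */
  #include <stdio.h>
  #include <stdatomic.h>
  #include <time.h>
  #include <unistd.h>

  #define RCU_CALL_MIN_SIZE 30            /* upstream default */

  static atomic_int rcu_call_count;

  static double now_ms(void)
  {
      struct timespec ts;
      clock_gettime(CLOCK_MONOTONIC, &ts);
      return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
  }

  int main(void)
  {
      atomic_store(&rcu_call_count, 1);   /* one callback pending, e.g. one unmap */

      double start = now_ms();
      int tries = 0;
      int n = atomic_load(&rcu_call_count);

      /* Same condition as call_rcu_thread(): keep sleeping while the batch
       * is smaller than RCU_CALL_MIN_SIZE, for up to 5 extra tries. */
      while (n == 0 || (n < RCU_CALL_MIN_SIZE && ++tries <= 5)) {
          usleep(10000);                  /* g_usleep(10000) upstream */
          n = atomic_load(&rcu_call_count);
      }

      printf("callback would run after ~%.0f ms\n", now_ms() - start);
      return 0;
  }

With the hack applied (RCU_CALL_MIN_SIZE set to 1 and the sleep reduced to
1 ms), a non-empty queue is processed without the extra batching sleeps,
which is presumably why it removes most of the stutter.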
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 2/14/25 17:33, Alex Bennée wrote:
>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>>
>>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>>
>>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>>> results in less CPU overhead and less/simpler code needed to support it.
>>> A DRM context consists of host and guest parts that have to be implemented
>>> for each GPU driver. On the guest side, a DRM context presents the virtual
>>> GPU as a real/native host GPU device to GL/VK applications.
>>>
>> <snip>
>>
>> So first the good news. I can now get this up and running (x86/kvm guest
>> with Intel graphics) and as far as I can tell the native context mode is
>> working. With Dongwon Kim's patch the mirroring/corruption I was seeing
>> is gone.
>>
>> I can successfully run glmark2-wayland (although see below) but vkmark
>> completely fails to start, reporting:
>>
>> MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_FENCE_PASSING
>> MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_GUEST_HANDLE
>> MESA: error: DRM_IOCTL_VIRTGPU_GET_CAPS failed with Invalid argument
>> MESA: error: DRM_IOCTL_VIRTGPU_CONTEXT_INIT failed with Invalid argument, continuing without context...
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:681: VK_ERROR_DEVICE_LOST
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:332: VK_ERROR_DEVICE_LOST
>> === Physical Device 0 ===
>> Vendor ID: 0x8086
>> Device ID: 0xA780
>> Device Name: Intel(R) Graphics (RPL-S)
>> Driver Version: 101068899
>> Device UUID: b39e1cf39b101489e3c6039406f78d6c
>>
>> I was booting with 4G of shared memory.
>
> Thanks for the testing.
>
> I assume all these errors are generated by the failing gfxstream. Hence,
> you may ignore them since you don't have gfxstream enabled.
>
>> Later versions of vkmark (2025.1) fail due to missing the VK_KHR_display
>> extension, required as of
>> https://github.com/vkmark/vkmark/commit/7c3189c6482cb84c3c0e69d6dabb9d80e0c0092a
>
> This VK_KHR_display problem is only reproducible with your rootfs that
> you shared with me. It could be a problem with your build configs or a
> buggy package version used by your rootfs build, more likely the former.

So you have built the latest vkmark? This is a recent addition to
vkmark for the 2025.1 release.

Does vulkaninfo --summary show the extension available for you?
It is certainly available on the host side:

  VK_KHR_display : extension revision 23

>>> # Note about known performance problem in Qemu:
>>>
>>> DRM contexts map host blobs extensively and these mapping operations work
>>> slowly in Qemu. The exact reason is unknown. Mappings work fast on Crosvm.
>>> For DRM contexts this problem is more visible than for Venus/Virgl.
>>
>> And how!
>>
>> With drm_native I get a lot of stutter while running and barely 100FPS
>> (compared to ~8000 on pure venus). IMHO we need to figure out why there
>> is such a discrepancy before merging because currently it makes more
>> sense to use Venus.
>
> If you run with Xorg/Wayland directly, without a DE, then it should
> work okay. This should be a problem with unmapping performance that I'm
> thinking about.
>
> That unmapping problem is partially understood. The unmapping code works
> correctly, but we'll need to optimize the flatview code to perform
> unmapping immediately.

Why immediately? Surely if we are unmapping we can defer it. Or is this
a case of having stale mappings making the life of new allocations
harder?

> Meanwhile, you may apply the QEMU hack below; it
> should resolve most of the stutter, please let me know if it helps.
>
> There is also a pending Mesa intel-virtio blob mapping optimization that
> currently isn't available in my gitlab code. I'll refresh that feature
> and then ask you to try it.
>
> Could be that there is more to the unmapping perf issue in QEMU. I'm
> investigating.
>
> AMDGPU nctx is less affected by the bad unmapping performance. I expect
> it will work well for you.
>
>
>
> diff --git a/util/rcu.c b/util/rcu.c
> index fa32c942e4bb..aac3522c323c 100644
> --- a/util/rcu.c
> +++ b/util/rcu.c
> @@ -174,7 +174,7 @@ void synchronize_rcu(void)
>  }
>  
>  
> -#define RCU_CALL_MIN_SIZE 30
> +#define RCU_CALL_MIN_SIZE 1
>  
>  /* Multi-producer, single-consumer queue based on urcu/static/wfqueue.h
>   * from liburcu.  Note that head is only used by the consumer.
> @@ -267,7 +267,7 @@ static void *call_rcu_thread(void *opaque)
>           * added before synchronize_rcu() starts.
>           */
>          while (n == 0 || (n < RCU_CALL_MIN_SIZE && ++tries <= 5)) {
> -            g_usleep(10000);
> +            g_usleep(1000);
>              if (n == 0) {
>                  qemu_event_reset(&rcu_call_ready_event);
>                  n = qatomic_read(&rcu_call_count);

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
On 2/17/25 18:22, Alex Bennée wrote:
...
>> This VK_KHR_display problem is only reproducible with your rootfs that
>> you shared with me. It could be a problem with your build configs or a
>> buggy package version used by your rootfs build, more likely the former.
>
> So you have built the latest vkmark? This is a recent addition to
> vkmark for the 2025.1 release.

Yes, latest 2025.1 from git/master.

> Does vulkaninfo --summary show the extension available for you? It is
> certainly available on the host side:
>
>   VK_KHR_display : extension revision 23
>

Have it on the guest with my rootfs, not with yours. I'd suspect the
problem is with your Mesa build flags; maybe you haven't enabled the
necessary flags related to WSI.

...
>>> With drm_native I get a lot of stutter while running and barely 100FPS
>>> (compared to ~8000 on pure venus). IMHO we need to figure out why there
>>> is such a discrepancy before merging because currently it makes more
>>> sense to use Venus.
>>
>> If you run with Xorg/Wayland directly, without a DE, then it should
>> work okay. This should be a problem with unmapping performance that I'm
>> thinking about.
>>
>> That unmapping problem is partially understood. The unmapping code works
>> correctly, but we'll need to optimize the flatview code to perform
>> unmapping immediately.
>
> Why immediately? Surely if we are unmapping we can defer it. Or is this
> a case of having stale mappings making the life of new allocations
> harder?

Unmapping currently works synchronously for virtio-gpu in QEMU, hence
deferring it blocks the whole virtio-gpu for up to 100+ ms. And if
multiple unmappings are done in a row, then it's 100 ms multiplied by
the number of unmappings.

-- 
Best regards,
Dmitry
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 2/17/25 18:22, Alex Bennée wrote:
> ...
>>> This VK_KHR_display problem is only reproducible with your rootfs that
>>> you shared with me. It could be a problem with your build configs or a
>>> buggy package version used by your rootfs build, more likely the former.
>>
>> So you have built the latest vkmark? This is a recent addition to
>> vkmark for the 2025.1 release.
>
> Yes, latest 2025.1 from git/master.
>
>> Does vulkaninfo --summary show the extension available for you? It is
>> certainly available on the host side:
>>
>>   VK_KHR_display : extension revision 23
>>
>
> Have it on the guest with my rootfs, not with yours. I'd suspect the
> problem is with your Mesa build flags; maybe you haven't enabled the
> necessary flags related to WSI.

I can't see any reference in the buildroot recipes. What are your Mesa
build flags?

>
> ...
>>>> With drm_native I get a lot of stutter while running and barely 100FPS
>>>> (compared to ~8000 on pure venus). IMHO we need to figure out why there
>>>> is such a discrepancy before merging because currently it makes more
>>>> sense to use Venus.
>>>
>>> If you run with Xorg/Wayland directly, without a DE, then it should
>>> work okay. This should be a problem with unmapping performance that I'm
>>> thinking about.
>>>
>>> That unmapping problem is partially understood. The unmapping code works
>>> correctly, but we'll need to optimize the flatview code to perform
>>> unmapping immediately.
>>
>> Why immediately? Surely if we are unmapping we can defer it. Or is this
>> a case of having stale mappings making the life of new allocations
>> harder?
>
> Unmapping currently works synchronously for virtio-gpu in QEMU, hence
> deferring it blocks the whole virtio-gpu for up to 100+ ms. And if
> multiple unmappings are done in a row, then it's 100 ms multiplied by
> the number of unmappings.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>
> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
> DRM native context [1] mediates the lower-level kernel driver UAPI, which
> results in less CPU overhead and less/simpler code needed to support it.
> A DRM context consists of host and guest parts that have to be implemented
> for each GPU driver. On the guest side, a DRM context presents the virtual
> GPU as a real/native host GPU device to GL/VK applications.
>
> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>
> Today there are four DRM native context drivers existing in the wild:
>
>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>  - AMDGPU, completely upstreamed

Well, good news and bad news.

I can verify that AMD native context works when I run my Aarch64 guest
on my Aarch64 host with -accel TCG (therefore avoiding KVM altogether).
I get potato frame rates though (~150FPS), although I suspect that is
because of the PCI errata workaround.

When it comes to graphics memory allocation, is there anything I can do
to force all allocations to be very aligned? Is this in the purview of
the AMD drm drivers or TTM itself?

I'm still seeing corruption with -display gtk,gl=on on my x86 system
BTW. I would like to understand if that is a problem with QEMU, GTK or
something else in the stack before we merge.

>  - Intel (i915), merge requests are open
>  - Asahi (Apple SoC GPUs), partially merged upstream
>
<snip>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
On 1/27/25 19:17, Alex Bennée wrote:
...
> I'm still seeing corruption with -display gtk,gl=on on my x86 system
> BTW. I would like to understand if that is a problem with QEMU, GTK or
> something else in the stack before we merge.

I reproduced the display mirroring/corruption issue and bisected it to
the following commit. The problem only happens when QEMU/GTK uses the
Wayland display directly, while previously I was running QEMU with
XWayland, which doesn't have the problem. Why this change breaks dmabuf
displaying with Wayland/GTK is unclear. Reverting the commit fixes the
bug.

+Dongwon Kim +Vivek Kasireddy

commit 77bf310084dad38b3a2badf01766c659056f1cf2
Author: Dongwon Kim <dongwon.kim@intel.com>
Date:   Fri Apr 26 15:50:59 2024 -0700

    ui/gtk: Draw guest frame at refresh cycle

    Draw routine needs to be manually invoked in the next refresh
    if there is a scanout blob from the guest. This is to prevent
    a situation where there is a scheduled draw event but it won't
    happen bacause the window is currently in inactive state
    (minimized or tabified). If draw is not done for a long time,
    gl_block timeout and/or fence timeout (on the guest) will happen
    eventually.

    v2: Use gd_gl_area_draw(vc) in gtk-gl-area.c

    Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: Gerd Hoffmann <kraxel@redhat.com>
    Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
    Cc: Daniel P. Berrangé <berrange@redhat.com>
    Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
    Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
    Message-Id: <20240426225059.3871283-1-dongwon.kim@intel.com>

-- 
Best regards,
Dmitry
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 1/27/25 19:17, Alex Bennée wrote:
> ...
>> I'm still seeing corruption with -display gtk,gl=on on my x86 system
>> BTW. I would like to understand if that is a problem with QEMU, GTK or
>> something else in the stack before we merge.
>
> I reproduced the display mirroring/corruption issue and bisected it to
> the following commit. The problem only happens when QEMU/GTK uses the
> Wayland display directly, while previously I was running QEMU with
> XWayland, which doesn't have the problem. Why this change breaks dmabuf
> displaying with Wayland/GTK is unclear.

Ahh, that makes sense - I obviously forgot to mention I'm running
sway/wayland across both machines.

> Reverting the commit fixes the bug.
>
> +Dongwon Kim +Vivek Kasireddy
>
> commit 77bf310084dad38b3a2badf01766c659056f1cf2
> Author: Dongwon Kim <dongwon.kim@intel.com>
> Date:   Fri Apr 26 15:50:59 2024 -0700
>
>     ui/gtk: Draw guest frame at refresh cycle
>
>     Draw routine needs to be manually invoked in the next refresh
>     if there is a scanout blob from the guest. This is to prevent
>     a situation where there is a scheduled draw event but it won't
>     happen bacause the window is currently in inactive state
>     (minimized or tabified). If draw is not done for a long time,
>     gl_block timeout and/or fence timeout (on the guest) will happen
>     eventually.
>
>     v2: Use gd_gl_area_draw(vc) in gtk-gl-area.c
>
>     Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
>     Cc: Gerd Hoffmann <kraxel@redhat.com>
>     Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
>     Cc: Daniel P. Berrangé <berrange@redhat.com>
>     Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
>     Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>     Message-Id: <20240426225059.3871283-1-dongwon.kim@intel.com>

Maybe a race on:

  QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;

?

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
On 1/27/25 19:17, Alex Bennée wrote:
> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>
>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>
>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>> results in less CPU overhead and less/simpler code needed to support it.
>> A DRM context consists of host and guest parts that have to be implemented
>> for each GPU driver. On the guest side, a DRM context presents the virtual
>> GPU as a real/native host GPU device to GL/VK applications.
>>
>> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>>
>> Today there are four DRM native context drivers existing in the wild:
>>
>>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>>  - AMDGPU, completely upstreamed
>
> Well, good news and bad news.
>
> I can verify that AMD native context works when I run my Aarch64 guest
> on my Aarch64 host with -accel TCG (therefore avoiding KVM altogether).
> I get potato frame rates though (~150FPS), although I suspect that is
> because of the PCI errata workaround.
>
> When it comes to graphics memory allocation, is there anything I can do
> to force all allocations to be very aligned? Is this in the purview of
> the AMD drm drivers or TTM itself?

All GPU allocations should be aligned to a page size. Alignment is
specified by the AMD driver. I don't expect that alignment is the
problem. What's the size of your host and guest pages?

-- 
Best regards,
Dmitry
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 1/27/25 19:17, Alex Bennée wrote:
>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>>
>>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>>
>>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>>> results in less CPU overhead and less/simpler code needed to support it.
>>> A DRM context consists of host and guest parts that have to be implemented
>>> for each GPU driver. On the guest side, a DRM context presents the virtual
>>> GPU as a real/native host GPU device to GL/VK applications.
>>>
>>> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>>>
>>> Today there are four DRM native context drivers existing in the wild:
>>>
>>>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>>>  - AMDGPU, completely upstreamed
>>
>> Well, good news and bad news.
>>
>> I can verify that AMD native context works when I run my Aarch64 guest
>> on my Aarch64 host with -accel TCG (therefore avoiding KVM altogether).
>> I get potato frame rates though (~150FPS), although I suspect that is
>> because of the PCI errata workaround.
>>
>> When it comes to graphics memory allocation, is there anything I can do
>> to force all allocations to be very aligned? Is this in the purview of
>> the AMD drm drivers or TTM itself?
>
> All GPU allocations should be aligned to a page size. Alignment is
> specified by the AMD driver. I don't expect that alignment is the
> problem. What's the size of your host and guest pages?

4k AFAIK.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
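One quick way to double-check that assumption on both sides (host and guest)
is a trivial query of the CPU page size; a minimal sketch:

  /* Print the CPU page size -- run on both host and guest to confirm the
   * 4 KiB assumption above. */
  #include <stdio.h>
  #include <unistd.h>

  int main(void)
  {
      printf("page size: %ld bytes\n", sysconf(_SC_PAGESIZE));
      return 0;
  }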