This patchset adds DRM native context support to VirtIO-GPU on Qemu.

Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
DRM native context [1] mediates the lower-level kernel driver UAPI, which
results in lower CPU overhead and less/simpler code needed to support it.
A DRM context consists of host and guest parts that have to be implemented
for each GPU driver. On the guest side, a DRM context presents the virtual
GPU as a real/native host GPU device to GL/VK applications.

[1] https://www.youtube.com/watch?v=9sFP_yddLLQ

Today there are four known DRM native context drivers in the wild:

 - Freedreno (Qualcomm SoC GPUs), completely upstreamed
 - AMDGPU, mostly merged upstream
 - Intel (i915), merge requests are open
 - Asahi (Apple SoC GPUs), WIP status

# How to try out DRM context:

1. DRM context uses host blobs and requires the latest development version
   of the Linux kernel [2], which has the necessary KVM fixes.

   [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/

2. Use the latest libvirglrenderer from upstream git/main for the Freedreno
   and AMDGPU native contexts. For Intel, use patches [3].

   [3] https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1384

3. On the guest, use the latest Mesa version for Freedreno. For AMDGPU use
   Mesa patches [4], for Intel [5].

   [4] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21658
   [5] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29870

4. On the guest, use the latest Linux kernel v6.6+. Apply patch [6] if
   you're running Xorg in the guest.

   [6] https://lore.kernel.org/dri-devel/20241020224725.179937-1-dmitry.osipenko@collabora.com/

Example Qemu cmdline that enables DRM context:

  qemu-system-x86_64 -device virtio-vga-gl,hostmem=4G,blob=on,drm_native_context=on \
      -machine q35,accel=kvm,memory-backend=mem1 \
      -object memory-backend-memfd,id=mem1,size=8G -m 8G

# Note about a known performance problem in Qemu:

DRM contexts map host blobs extensively, and these mapping operations are
slow in Qemu. The exact reason is unknown; the same mappings are fast on
Crosvm. For DRM contexts this problem is more visible than for Venus/Virgl.
(A guest-side sketch of the mapping path in question follows this letter.)

Changelog:

v5: - Added r-bs from Akihiko Odaki.

    - Added acks from Michael Tsirkin.

    - Fixed a compilation warning with an older version of virglrenderer
      that was reported by Alex Bennée. Noticed that I need to keep the
      old virgl_write_fence() code around for the older virglrenderer in
      the "Support asynchronous fencing" patch, so added it back and
      verified that the old virglrenderer works properly.

    - Added a new patch from Alex Bennée that adds more virtio-gpu
      documentation, with a couple of corrections and additions to it
      from me.

    - Rebased patches on top of the latest staging tree.

v4: - Improved the SDL2/dmabuf patch by reusing the existing Meson X11
      config option, handling the EGL error better and extending the
      comment explaining that it's safe to enable the SDL2 EGL preference
      hint. As was suggested by Akihiko Odaki.

    - Replaced another QSLIST_FOREACH_SAFE with QSLIST_EMPTY+FIRST in the
      async-fencing patch for more consistency of the code. As was
      suggested by Akihiko Odaki.

    - Added missing braces around an if-statement, spotted by Alex Bennée.

    - Renamed the 'drm=on' option of the virtio-gpu-gl device to
      'drm_native_context=on' for more clarity, as was suggested by Alex
      Bennée. Haven't added the new context-type option that was also
      proposed by Alex; might do it with a separate patch. This
      context-type option would duplicate and deprecate the existing
      options, but in the longer run it will likely be worthwhile.
    - Dropped the Linux headers-update patch as the headers have been
      updated in the staging tree.

v3: - Improved the EGL presence-check code on X11 systems for the SDL2
      hint that prefers EGL over GLX by using better ifdefs and checking
      Xlib presence at build time, to avoid a build failure when libSDL2
      and the system are configured with X11 support disabled. Also added
      a clarifying comment noting that the X11 hint doesn't affect
      Wayland systems. Suggested by Akihiko Odaki.

    - Corrected strerror(err) usage that passed a negative error code
      where a positive one was expected, and vice versa, which was caught
      by Akihiko Odaki. Added a clarifying comment for the case where we
      get a positive error code from virglrenderer that differs from the
      other virglrenderer API functions.

    - Improved the QSLIST usage by dropping the mutex protecting the
      async fence list and using the atomic variants of the QSLIST
      helpers instead. Switched away from using the FOREACH helper to
      improve readability of the code, showing that we don't process the
      list in a suboptimal way. As was suggested by Akihiko Odaki.

    - Updated the patchset base to Venus v18.

v2: - Updated the SDL2-dmabuf patch by making use of error_report() and
      checking the presence of X11+EGL on the system before making SDL2
      prefer the EGL backend over GLX, as suggested by Akihiko Odaki.

    - Improved SDL2's dmabuf-presence check that wasn't done properly in
      v1, where EGL was set up only after the first console was fully
      initialized, and thus SDL's display .has_dmabuf callback didn't
      work for the first console. Now the dmabuf support status is
      pre-checked before the console is registered.

    - Updated the commit description of the patch that fixes SDL2's
      context-switching logic with a more detailed explanation of the
      problem. Suggested by Akihiko Odaki.

    - Corrected a rebase typo in the async-fencing patch and switched
      async fencing to use a singly-linked list instead of a
      doubly-linked one, as was suggested by Akihiko Odaki.

    - Replaced "=true" with "=on" in the DRM native context documentation
      example and made virtio_gpu_virgl_init() fail with an error message
      if the DRM context can't be initialized, instead of giving a
      warning message, as was suggested by Akihiko Odaki.

    - Added patchew's dependency tag to the cover letter, as was
      suggested by Akihiko Odaki.

Alex Bennée (1):
  docs/system: Expand the virtio-gpu documentation

Dmitry Osipenko (6):
  ui/sdl2: Restore original context after new context creation
  virtio-gpu: Handle virgl fence creation errors
  virtio-gpu: Support asynchronous fencing
  virtio-gpu: Support DRM native context
  ui/sdl2: Don't disable scanout when display is refreshed
  ui/gtk: Don't disable scanout when display is refreshed

Pierre-Eric Pelloux-Prayer (1):
  ui/sdl2: Implement dpy dmabuf functions

 docs/system/devices/virtio-gpu.rst | 105 +++++++++++++++++--
 hw/display/virtio-gpu-gl.c         |   5 +
 hw/display/virtio-gpu-virgl.c      | 159 ++++++++++++++++++++++++++++-
 hw/display/virtio-gpu.c            |  15 +++
 include/hw/virtio/virtio-gpu.h     |  16 +++
 include/ui/sdl2.h                  |   7 ++
 meson.build                        |   6 +-
 ui/gtk-egl.c                       |   1 -
 ui/gtk-gl-area.c                   |   1 -
 ui/sdl2-gl.c                       |  68 +++++++++++-
 ui/sdl2.c                          |  42 ++++++++
 11 files changed, 411 insertions(+), 14 deletions(-)

-- 
2.47.1
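For context, here is a minimal, illustrative sketch of the guest-side
host-blob mapping path referred to in the performance note above, written
against the virtio-gpu DRM UAPI (virtgpu_drm.h). It is not code from this
series: error handling is trimmed, the header path may differ per setup,
and the blob_id value is a hypothetical host-resource id that a
native-context driver would have obtained earlier via execbuffer:

  #include <stddef.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <drm/virtgpu_drm.h>  /* or <libdrm/virtgpu_drm.h>, per setup */

  static void *map_host_blob(int drm_fd, __u64 blob_id, size_t size)
  {
      struct drm_virtgpu_resource_create_blob blob = {
          .blob_mem   = VIRTGPU_BLOB_MEM_HOST3D,
          .blob_flags = VIRTGPU_BLOB_FLAG_USE_MAPPABLE,
          .size       = size,
          .blob_id    = blob_id,  /* hypothetical host-resource id */
      };
      void *ptr;

      /* Ask the virtio-gpu kernel driver for a guest GEM handle that
       * refers to a host-allocated blob... */
      if (ioctl(drm_fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB, &blob))
          return NULL;

      /* ...look up the fake mmap offset for that handle... */
      struct drm_virtgpu_map map = { .handle = blob.bo_handle };
      if (ioctl(drm_fd, DRM_IOCTL_VIRTGPU_MAP, &map))
          return NULL;

      /* ...and map it. Each such mmap() of a host blob is one of the
       * operations that are currently slow under Qemu vs Crosvm. */
      ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                 drm_fd, map.offset);
      return ptr == MAP_FAILED ? NULL : ptr;
  }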
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>
> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
> DRM native context [1] mediates the lower-level kernel driver UAPI, which
> results in lower CPU overhead and less/simpler code needed to support it.
> A DRM context consists of host and guest parts that have to be implemented
> for each GPU driver. On the guest side, a DRM context presents the virtual
> GPU as a real/native host GPU device to GL/VK applications.
>
> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>
> Today there are four known DRM native context drivers in the wild:
>
>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>  - AMDGPU, mostly merged upstream

I tried my AMD system today with:

Host:
  Aarch64 AVA system
  Trixie
  virglrenderer @ v1.1.0/99557f5aa130930d11f04ffeb07f3a9aa5963182
  -display sdl,gl=on (gtk,gl=on also came up but handled window
  resizing poorly)

KVM Guest:
  Aarch64
  Trixie
  mesa @ main/d27748a76f7dd9236bfcf9ef172dc13b8c0e170f
  -Dvulkan-drivers=virtio,amd -Dgallium-drivers=virgl,radeonsi -Damdgpu-virtio=true

However, when I ran vulkaninfo --summary, KVM faulted with:

debian-trixie login: error: kvm run failed Bad address
PC=0000ffffb9aa1eb0 X00=0000ffffba0450a4 X01=0000aaaaf7f32400
X02=000000000000013c X03=0000ffffba045098 X04=0000aaaaf7f3253c
X05=0000ffffba0451d4 X06=00000000c0016900 X07=000000000000000e
X08=0000000000000014 X09=00000000000000ff X10=0000aaaaf7f32500
X11=0000aaaaf7e4d028 X12=0000aaaaf7edbcb0 X13=0000000000000001
X14=000000000000000c X15=0000000000007718 X16=0000ffffb93601f0
X17=0000ffffb9aa1dc0 X18=00000000000076f0 X19=0000aaaaf7f31330
X20=0000aaaaf7f323f0 X21=0000aaaaf7f235e0 X22=000000000000004c
X23=0000aaaaf7f2b5e0 X24=0000aaaaf7ee0cb0 X25=00000000000000ff
X26=0000000000000076 X27=0000ffffcd2b18a8 X28=0000aaaaf7ee0cb0
X29=0000ffffcd2b0bd0 X30=0000ffffb86c8b98 SP=0000ffffcd2b0bd0
PSTATE=20001000 --C- EL0t
QEMU 9.2.50 monitor - type 'help' for more information
(qemu) quit

Which looks very much like the PFN locking failure. However, booting up
with venus=on instead works. Could there be any differences in the way
device memory is mapped in the two cases?

>  - Intel (i915), merge requests are open
>  - Asahi (Apple SoC GPUs), WIP status
>
<snip>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
On 1/22/25 20:00, Alex Bennée wrote:
> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>
>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>
>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>> results in lower CPU overhead and less/simpler code needed to support it.
>> A DRM context consists of host and guest parts that have to be implemented
>> for each GPU driver. On the guest side, a DRM context presents the virtual
>> GPU as a real/native host GPU device to GL/VK applications.
>>
>> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>>
>> Today there are four known DRM native context drivers in the wild:
>>
>>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>>  - AMDGPU, mostly merged upstream
>
> I tried my AMD system today with:
>
> Host:
>   Aarch64 AVA system
>   Trixie
>   virglrenderer @ v1.1.0/99557f5aa130930d11f04ffeb07f3a9aa5963182
>   -display sdl,gl=on (gtk,gl=on also came up but handled window
>   resizing poorly)
>
> KVM Guest:
>   Aarch64
>   Trixie
>   mesa @ main/d27748a76f7dd9236bfcf9ef172dc13b8c0e170f
>   -Dvulkan-drivers=virtio,amd -Dgallium-drivers=virgl,radeonsi -Damdgpu-virtio=true
>
> However, when I ran vulkaninfo --summary, KVM faulted with:
>
> debian-trixie login: error: kvm run failed Bad address
> PC=0000ffffb9aa1eb0 X00=0000ffffba0450a4 X01=0000aaaaf7f32400
> X02=000000000000013c X03=0000ffffba045098 X04=0000aaaaf7f3253c
> X05=0000ffffba0451d4 X06=00000000c0016900 X07=000000000000000e
> X08=0000000000000014 X09=00000000000000ff X10=0000aaaaf7f32500
> X11=0000aaaaf7e4d028 X12=0000aaaaf7edbcb0 X13=0000000000000001
> X14=000000000000000c X15=0000000000007718 X16=0000ffffb93601f0
> X17=0000ffffb9aa1dc0 X18=00000000000076f0 X19=0000aaaaf7f31330
> X20=0000aaaaf7f323f0 X21=0000aaaaf7f235e0 X22=000000000000004c
> X23=0000aaaaf7f2b5e0 X24=0000aaaaf7ee0cb0 X25=00000000000000ff
> X26=0000000000000076 X27=0000ffffcd2b18a8 X28=0000aaaaf7ee0cb0
> X29=0000ffffcd2b0bd0 X30=0000ffffb86c8b98 SP=0000ffffcd2b0bd0
> PSTATE=20001000 --C- EL0t
> QEMU 9.2.50 monitor - type 'help' for more information
> (qemu) quit
>
> Which looks very much like the PFN locking failure. However, booting up
> with venus=on instead works. Could there be any differences in the way
> device memory is mapped in the two cases?

Memory mapping works exactly the same for nctx and venus. Are you on a
6.13 host kernel?

-- 
Best regards,
Dmitry
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 1/22/25 20:00, Alex Bennée wrote:
>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>>
>>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>>
>>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>>> results in lower CPU overhead and less/simpler code needed to support it.
>>> A DRM context consists of host and guest parts that have to be implemented
>>> for each GPU driver. On the guest side, a DRM context presents the virtual
>>> GPU as a real/native host GPU device to GL/VK applications.
>>>
>>> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>>>
>>> Today there are four known DRM native context drivers in the wild:
>>>
>>>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>>>  - AMDGPU, mostly merged upstream
>>
>> I tried my AMD system today with:
>>
>> Host:
>>   Aarch64 AVA system
>>   Trixie
>>   virglrenderer @ v1.1.0/99557f5aa130930d11f04ffeb07f3a9aa5963182
>>   -display sdl,gl=on (gtk,gl=on also came up but handled window
>>   resizing poorly)
>>
>> KVM Guest:
>>   Aarch64
>>   Trixie
>>   mesa @ main/d27748a76f7dd9236bfcf9ef172dc13b8c0e170f
>>   -Dvulkan-drivers=virtio,amd -Dgallium-drivers=virgl,radeonsi -Damdgpu-virtio=true
>>
>> However, when I ran vulkaninfo --summary, KVM faulted with:
>>
>> debian-trixie login: error: kvm run failed Bad address
>> PC=0000ffffb9aa1eb0 X00=0000ffffba0450a4 X01=0000aaaaf7f32400
>> X02=000000000000013c X03=0000ffffba045098 X04=0000aaaaf7f3253c
>> X05=0000ffffba0451d4 X06=00000000c0016900 X07=000000000000000e
>> X08=0000000000000014 X09=00000000000000ff X10=0000aaaaf7f32500
>> X11=0000aaaaf7e4d028 X12=0000aaaaf7edbcb0 X13=0000000000000001
>> X14=000000000000000c X15=0000000000007718 X16=0000ffffb93601f0
>> X17=0000ffffb9aa1dc0 X18=00000000000076f0 X19=0000aaaaf7f31330
>> X20=0000aaaaf7f323f0 X21=0000aaaaf7f235e0 X22=000000000000004c
>> X23=0000aaaaf7f2b5e0 X24=0000aaaaf7ee0cb0 X25=00000000000000ff
>> X26=0000000000000076 X27=0000ffffcd2b18a8 X28=0000aaaaf7ee0cb0
>> X29=0000ffffcd2b0bd0 X30=0000ffffb86c8b98 SP=0000ffffcd2b0bd0
>> PSTATE=20001000 --C- EL0t
>> QEMU 9.2.50 monitor - type 'help' for more information
>> (qemu) quit
>>
>> Which looks very much like the PFN locking failure. However, booting up
>> with venus=on instead works. Could there be any differences in the way
>> device memory is mapped in the two cases?
>
> Memory mapping works exactly the same for nctx and venus. Are you on a
> 6.13 host kernel?

Yes - with the Altra PCI workaround patches on both host and guest
kernel.

Is there any way to trace the sharing of device memory on the host so I
can verify it's an attempt at device access? The PC looks like it's in
user-space, but once this fails the guest is suspended, so I can't poke
around in its environment.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
On 1/23/25 14:58, Alex Bennée wrote:
> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>
>> On 1/22/25 20:00, Alex Bennée wrote:
>>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>>>
>>>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>>>
>>>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>>>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>>>> results in lower CPU overhead and less/simpler code needed to support it.
>>>> A DRM context consists of host and guest parts that have to be implemented
>>>> for each GPU driver. On the guest side, a DRM context presents the virtual
>>>> GPU as a real/native host GPU device to GL/VK applications.
>>>>
>>>> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>>>>
>>>> Today there are four known DRM native context drivers in the wild:
>>>>
>>>>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>>>>  - AMDGPU, mostly merged upstream
>>>
>>> I tried my AMD system today with:
>>>
>>> Host:
>>>   Aarch64 AVA system
>>>   Trixie
>>>   virglrenderer @ v1.1.0/99557f5aa130930d11f04ffeb07f3a9aa5963182
>>>   -display sdl,gl=on (gtk,gl=on also came up but handled window
>>>   resizing poorly)
>>>
>>> KVM Guest:
>>>   Aarch64
>>>   Trixie
>>>   mesa @ main/d27748a76f7dd9236bfcf9ef172dc13b8c0e170f
>>>   -Dvulkan-drivers=virtio,amd -Dgallium-drivers=virgl,radeonsi -Damdgpu-virtio=true
>>>
>>> However, when I ran vulkaninfo --summary, KVM faulted with:
>>>
>>> debian-trixie login: error: kvm run failed Bad address
>>> PC=0000ffffb9aa1eb0 X00=0000ffffba0450a4 X01=0000aaaaf7f32400
>>> X02=000000000000013c X03=0000ffffba045098 X04=0000aaaaf7f3253c
>>> X05=0000ffffba0451d4 X06=00000000c0016900 X07=000000000000000e
>>> X08=0000000000000014 X09=00000000000000ff X10=0000aaaaf7f32500
>>> X11=0000aaaaf7e4d028 X12=0000aaaaf7edbcb0 X13=0000000000000001
>>> X14=000000000000000c X15=0000000000007718 X16=0000ffffb93601f0
>>> X17=0000ffffb9aa1dc0 X18=00000000000076f0 X19=0000aaaaf7f31330
>>> X20=0000aaaaf7f323f0 X21=0000aaaaf7f235e0 X22=000000000000004c
>>> X23=0000aaaaf7f2b5e0 X24=0000aaaaf7ee0cb0 X25=00000000000000ff
>>> X26=0000000000000076 X27=0000ffffcd2b18a8 X28=0000aaaaf7ee0cb0
>>> X29=0000ffffcd2b0bd0 X30=0000ffffb86c8b98 SP=0000ffffcd2b0bd0
>>> PSTATE=20001000 --C- EL0t
>>> QEMU 9.2.50 monitor - type 'help' for more information
>>> (qemu) quit
>>>
>>> Which looks very much like the PFN locking failure. However, booting up
>>> with venus=on instead works. Could there be any differences in the way
>>> device memory is mapped in the two cases?
>>
>> Memory mapping works exactly the same for nctx and venus. Are you on a
>> 6.13 host kernel?
>
> Yes - with the Altra PCI workaround patches on both host and guest
> kernel.
>
> Is there any way to trace the sharing of device memory on the host so I
> can verify it's an attempt at device access? The PC looks like it's in
> user-space, but once this fails the guest is suspended, so I can't poke
> around in its environment.

I add printk's to the kernel in such cases. There is likely no better
way to find out why it fails.

Do your ARM VM and host both use a 4k page size?

Well, if it's a page refcounting bug on ARM/KVM, then applying [1] to
the host driver will make it work and we will know where the problem is.
Please try.

[1]
https://patchwork.kernel.org/project/kvm/patch/20220815095423.11131-1-dmitry.osipenko@collabora.com/

-- 
Best regards,
Dmitry
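For reference, the kind of throwaway debug printk meant above might look
like the following hunk against the host kernel's stage-2 fault handler.
The placement in user_mem_abort() in arch/arm64/kvm/mmu.c and the local
variables assumed to be in scope (fault_ipa, hva) vary across kernel
versions, so treat this as a sketch rather than an applicable patch:

  --- a/arch/arm64/kvm/mmu.c
  +++ b/arch/arm64/kvm/mmu.c
  @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, ...)
  +	/* throwaway debug aid: log every stage-2 abort KVM handles */
  +	pr_info("kvm: s2 abort ipa=%llx hva=%lx esr=%llx\n",
  +		(unsigned long long)fault_ipa, hva,
  +		(unsigned long long)kvm_vcpu_get_esr(vcpu));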
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 1/23/25 14:58, Alex Bennée wrote:
>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>>
>>> On 1/22/25 20:00, Alex Bennée wrote:
>>>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>>>>
>>>>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>>>>
>>>>> Contrary to Virgl and Venus contexts, which mediate high-level GFX APIs,
>>>>> DRM native context [1] mediates the lower-level kernel driver UAPI, which
>>>>> results in lower CPU overhead and less/simpler code needed to support it.
>>>>> A DRM context consists of host and guest parts that have to be implemented
>>>>> for each GPU driver. On the guest side, a DRM context presents the virtual
>>>>> GPU as a real/native host GPU device to GL/VK applications.
>>>>>
>>>>> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>>>>>
>>>>> Today there are four known DRM native context drivers in the wild:
>>>>>
>>>>>  - Freedreno (Qualcomm SoC GPUs), completely upstreamed
>>>>>  - AMDGPU, mostly merged upstream
>>>>
>>>> I tried my AMD system today with:
>>>>
>>>> Host:
>>>>   Aarch64 AVA system
>>>>   Trixie
>>>>   virglrenderer @ v1.1.0/99557f5aa130930d11f04ffeb07f3a9aa5963182
>>>>   -display sdl,gl=on (gtk,gl=on also came up but handled window
>>>>   resizing poorly)
>>>>
>>>> KVM Guest:
>>>>   Aarch64
>>>>   Trixie
>>>>   mesa @ main/d27748a76f7dd9236bfcf9ef172dc13b8c0e170f
>>>>   -Dvulkan-drivers=virtio,amd -Dgallium-drivers=virgl,radeonsi -Damdgpu-virtio=true
>>>>
>>>> However, when I ran vulkaninfo --summary, KVM faulted with:
>>>>
>>>> debian-trixie login: error: kvm run failed Bad address
>>>> PC=0000ffffb9aa1eb0 X00=0000ffffba0450a4 X01=0000aaaaf7f32400
>>>> X02=000000000000013c X03=0000ffffba045098 X04=0000aaaaf7f3253c
>>>> X05=0000ffffba0451d4 X06=00000000c0016900 X07=000000000000000e
>>>> X08=0000000000000014 X09=00000000000000ff X10=0000aaaaf7f32500
>>>> X11=0000aaaaf7e4d028 X12=0000aaaaf7edbcb0 X13=0000000000000001
>>>> X14=000000000000000c X15=0000000000007718 X16=0000ffffb93601f0
>>>> X17=0000ffffb9aa1dc0 X18=00000000000076f0 X19=0000aaaaf7f31330
>>>> X20=0000aaaaf7f323f0 X21=0000aaaaf7f235e0 X22=000000000000004c
>>>> X23=0000aaaaf7f2b5e0 X24=0000aaaaf7ee0cb0 X25=00000000000000ff
>>>> X26=0000000000000076 X27=0000ffffcd2b18a8 X28=0000aaaaf7ee0cb0
>>>> X29=0000ffffcd2b0bd0 X30=0000ffffb86c8b98 SP=0000ffffcd2b0bd0
>>>> PSTATE=20001000 --C- EL0t
>>>> QEMU 9.2.50 monitor - type 'help' for more information
>>>> (qemu) quit
>>>>
>>>> Which looks very much like the PFN locking failure. However, booting up
>>>> with venus=on instead works. Could there be any differences in the way
>>>> device memory is mapped in the two cases?
>>>
>>> Memory mapping works exactly the same for nctx and venus. Are you on a
>>> 6.13 host kernel?
>>
>> Yes - with the Altra PCI workaround patches on both host and guest
>> kernel.
>>
>> Is there any way to trace the sharing of device memory on the host so I
>> can verify it's an attempt at device access? The PC looks like it's in
>> user-space, but once this fails the guest is suspended, so I can't poke
>> around in its environment.
>
> I add printk's to the kernel in such cases. There is likely no better
> way to find out why it fails.
>
> Do your ARM VM and host both use a 4k page size?
>
> Well, if it's a page refcounting bug on ARM/KVM, then applying [1] to
> the host driver will make it work and we will know where the problem is.
> Please try.
>
> [1]
> https://patchwork.kernel.org/project/kvm/patch/20220815095423.11131-1-dmitry.osipenko@collabora.com/

That makes no difference. AFAICT the fault is triggered in userspace:

error: kvm run failed Bad address
PC=0000ffffb1911eb0 X00=0000ffffb1eb60a4 X01=0000aaaaeb1f5400
X02=000000000000013c X03=0000ffffb1eb6098 X04=0000aaaaeb1f553c
X05=0000ffffb1eb61d4 X06=00000000c0016900 X07=000000000000000e
X08=0000000000000014 X09=00000000000000ff X10=0000aaaaeb1f5500
X11=0000aaaaeb110028 X12=0000aaaaeb19ecb0 X13=0000000000000001
X14=000000000000000c X15=0000000000007718 X16=0000ffffb11d01f0
X17=0000ffffb1911dc0 X18=00000000000076f0 X19=0000aaaaeb1f4330
X20=0000aaaaeb1f53f0 X21=0000aaaaeb1e65e0 X22=000000000000004c
X23=0000aaaaeb1ee5e0 X24=0000aaaaeb1a3cb0 X25=00000000000000ff
X26=0000000000000076 X27=0000ffffc7db4e58 X28=0000aaaaeb1a3cb0
X29=0000ffffc7db4180 X30=0000ffffb0538b98 SP=0000ffffc7db4180
PSTATE=20001000 --C- EL0t
QEMU 9.2.50 monitor - type 'help' for more information
(qemu) quit

Thread 4 received signal SIGABRT, Aborted.
[Switching to Thread 1.4]
cpu_do_idle () at /home/alex/lsrc/linux.git/arch/arm64/kernel/idle.c:32
32        arm_cpuidle_restore_irq_context(&context);
(gdb) alex
Undefined command: "alex".  Try "help".
(gdb) bt
#0  cpu_do_idle () at /home/alex/lsrc/linux.git/arch/arm64/kernel/idle.c:32
#1  0xffff800081962180 in arch_cpu_idle () at /home/alex/lsrc/linux.git/arch/arm64/kernel/idle.c:44
#2  0xffff8000819622c4 in default_idle_call () at /home/alex/lsrc/linux.git/kernel/sched/idle.c:117
#3  0xffff80008013af8c in cpuidle_idle_call () at /home/alex/lsrc/linux.git/kernel/sched/idle.c:185
#4  do_idle () at /home/alex/lsrc/linux.git/kernel/sched/idle.c:325
#5  0xffff80008013b208 in cpu_startup_entry (state=state@entry=CPUHP_AP_ONLINE_IDLE)
    at /home/alex/lsrc/linux.git/kernel/sched/idle.c:423
#6  0xffff800080043668 in secondary_start_kernel () at /home/alex/lsrc/linux.git/arch/arm64/kernel/smp.c:279
#7  0xffff800080051f78 in __secondary_switched () at /home/alex/lsrc/linux.git/arch/arm64/kernel/head.S:420
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info threads
  Id   Target Id                    Frame
  1    Thread 1.1 (CPU#0 [running]) cpu_do_idle () at /home/alex/lsrc/linux.git/arch/arm64/kernel/idle.c:32
  2    Thread 1.2 (CPU#1 [halted ]) 0x0000ffffb1911eb0 in ?? ()
  3    Thread 1.3 (CPU#2 [halted ]) cpu_do_idle () at /home/alex/lsrc/linux.git/arch/arm64/kernel/idle.c:32
* 4    Thread 1.4 (CPU#3 [halted ]) cpu_do_idle () at /home/alex/lsrc/linux.git/arch/arm64/kernel/idle.c:32
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0  0x0000ffffb1911eb0 in ?? ()
(gdb) bt
#0  0x0000ffffb1911eb0 in ?? ()
#1  0x0000aaaaeb1ea5e0 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) frame 0
#0  0x0000ffffb1911eb0 in ?? ()
(gdb) x/5i $pc
=> 0xffffb1911eb0:  str   q3, [x0]
   0xffffb1911eb4:  ldp   q2, q3, [x1, #48]
   0xffffb1911eb8:  subs  x2, x2, #0x90
   0xffffb1911ebc:  b.ls  0xffffb1911ee0  // b.plast
   0xffffb1911ec0:  stp   q0, q1, [x3, #16]
(gdb) p/x $x0
$1 = 0xffffb1eb60a4

I suspect that is memcpy again, but I'll try and track it down.

The only other note is:

[  411.509647] kvm [7713]: Unsupported FSC: EC=0x24 xFSC=0x21 ESR_EL2=0x92000061

Which is:

  EC   0x24 - Data Abort from lower EL
  DFSC 0x21 - Alignment fault
  WnR  1    - Caused by write

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
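For anyone following along, that decode can be reproduced with a few
shifts; the bit positions (EC at ESR[31:26], WnR at ISS bit 6, DFSC at
ISS[5:0] for data aborts) are architectural. A minimal standalone sketch:

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
      uint64_t esr = 0x92000061;           /* ESR_EL2 from the log above */
      unsigned ec   = (esr >> 26) & 0x3f;  /* 0x24: data abort, lower EL */
      unsigned wnr  = (esr >> 6)  & 0x1;   /* 1: caused by a write */
      unsigned dfsc =  esr        & 0x3f;  /* 0x21: alignment fault */
      printf("EC=%#x WnR=%u DFSC=%#x\n", ec, wnr, dfsc);
      return 0;
  }

An alignment fault on a "str q3" (a 128-bit SIMD store, typical of glibc's
memcpy) would be consistent with an unaligned write into a region the guest
has mapped with Device memory attributes, which, unlike Normal memory, do
not permit unaligned accesses.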