[PATCH 0/5] virtio-gpu: Force RCU when unmapping blob

Akihiko Odaki posted 5 patches 2 weeks, 2 days ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20251029-force._5Frcu-v1-0-bf860a6277a6@rsg.ci.i.u-tokyo.ac.jp
Maintainers: "Alex Bennée" <alex.bennee@linaro.org>, Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>, Dmitry Osipenko <dmitry.osipenko@collabora.com>, "Michael S. Tsirkin" <mst@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
Posted by Akihiko Odaki 2 weeks, 2 days ago
Based-on: <20251016-force-v1-1-919a82112498@rsg.ci.i.u-tokyo.ac.jp>
("[PATCH] rcu: Unify force quiescent state")

Unmapping a blob changes the memory map, which is protected with RCU.
RCU is designed to minimize the read-side overhead at the cost of
reclamation delay. While this design usually makes sense, it is
problematic when unmapping a blob because the operation blocks all
virtio-gpu commands and causes perceivable disruption.

Minimize this disruption with force_rcu(), which minimizes the
reclamation delay at the cost of read-side overhead.
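
As a rough illustration of that trade-off, here is a single-threaded toy
model. It is not QEMU's RCU implementation: every name (ToyRcu,
toy_simulate() and so on), the tick-based clock and the poll interval are
assumptions made up for this sketch. A reclaimer normally notices that
readers are gone only at its next periodic poll; in "force" mode the
reader does extra work on unlock so reclamation happens immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state: one reader, one pending reclamation. */
typedef struct {
    bool reader_active;      /* reader is inside a read-side critical section */
    bool pending;            /* a reclamation is waiting for the reader */
    bool force;              /* reader must report when it unlocks */
    unsigned notifications;  /* extra read-side work caused by force */
    int reclaimed_at;        /* tick of reclamation, -1 while still pending */
} ToyRcu;

/* Reclaim only if something is pending and no reader is active. */
static void toy_try_reclaim(ToyRcu *r, int now)
{
    if (r->pending && !r->reader_active) {
        r->pending = false;
        r->reclaimed_at = now;
    }
}

/* Leaving the critical section is free unless force is set. */
static void toy_read_unlock(ToyRcu *r, int now)
{
    r->reader_active = false;
    if (r->force) {
        r->notifications++;       /* the read-side overhead */
        toy_try_reclaim(r, now);  /* reclaimer is woken right away */
    }
}

/*
 * The reader holds the lock from tick 0 until unlock_tick. The reclaimer
 * polls at every multiple of poll_interval. Returns the tick at which the
 * reclamation finally happens.
 */
static int toy_simulate(int unlock_tick, int poll_interval, bool force,
                        unsigned *notifications)
{
    ToyRcu r = {
        .reader_active = true,
        .pending = true,
        .force = force,
        .reclaimed_at = -1,
    };

    assert(unlock_tick >= 0 && poll_interval > 0);

    for (int now = 0; r.reclaimed_at < 0; now++) {
        if (now == unlock_tick) {
            toy_read_unlock(&r, now);
        }
        if (now % poll_interval == 0) {
            toy_try_reclaim(&r, now);
        }
    }

    if (notifications != NULL) {
        *notifications = r.notifications;
    }
    return r.reclaimed_at;
}
```

With a reader that unlocks at tick 3 and a poll interval of 10 ticks, the
unforced run reclaims at tick 10 with no reader notifications, while the
forced run reclaims at tick 3 at the price of one notification.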

Dmitry, can you check whether this change makes a difference?

Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
Akihiko Odaki (5):
      futex: Add qemu_futex_timedwait()
      qemu-thread: Add qemu_event_timedwait()
      rcu: Use call_rcu() in synchronize_rcu()
      rcu: Wake the RCU thread when draining
      virtio-gpu: Force RCU when unmapping blob

 include/qemu/futex.h          |  29 ++++++--
 include/qemu/rcu.h            |   1 +
 include/qemu/thread-posix.h   |  11 +++
 include/qemu/thread.h         |   8 ++-
 hw/display/virtio-gpu-virgl.c |   1 +
 util/event.c                  |  34 ++++++++--
 util/qemu-thread-posix.c      |  11 +--
 util/rcu.c                    | 153 ++++++++++++++++++++++++------------------
 8 files changed, 163 insertions(+), 85 deletions(-)
---
base-commit: ee7fbe81705732785aef2cb568bbc5d8f7d2fce1
change-id: 20251027-force_rcu-616c743373f7

Best regards,
--  
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Re: [PATCH 0/5] virtio-gpu: Force RCU when unmapping blob
Posted by Dmitry Osipenko 2 weeks, 1 day ago
Hello Akihiko,

On 10/29/25 09:12, Akihiko Odaki wrote:
> Based-on: <20251016-force-v1-1-919a82112498@rsg.ci.i.u-tokyo.ac.jp>
> ("[PATCH] rcu: Unify force quiescent state")
> 
> Unmapping a blob changes the memory map, which is protected with RCU.
> RCU is designed to minimize the read-side overhead at the cost of
> reclamation delay. While this design usually makes sense, it is
> problematic when unmapping a blob because the operation blocks all
> virtio-gpu commands and causes perceivable disruption.
> 
> Minimize this disruption with force_rcu(), which minimizes the
> reclamation delay at the cost of read-side overhead.
> 
> Dmitry, can you check whether this change makes a difference?

Tested this series with venus and native contexts.

The improvement is very noticeable. There are almost no stalls with
venus and far fewer stalls with native context. A stall now takes
2-10ms at most, as opposed to the 50ms observed previously. I spotted
no stability issues; everything works.

Thank you for working on this improvement.

-- 
Best regards,
Dmitry
Re: [PATCH 0/5] virtio-gpu: Force RCU when unmapping blob
Posted by Alex Bennée 2 weeks ago
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> writes:

> Based-on: <20251016-force-v1-1-919a82112498@rsg.ci.i.u-tokyo.ac.jp>
> ("[PATCH] rcu: Unify force quiescent state")
>
> Unmapping a blob changes the memory map, which is protected with RCU.
> RCU is designed to minimize the read-side overhead at the cost of
> reclamation delay. While this design usually makes sense, it is
> problematic when unmapping a blob because the operation blocks all
> virtio-gpu commands and causes perceivable disruption.
>
> Minimize this disruption with force_rcu(), which minimizes the
> reclamation delay at the cost of read-side overhead.
>
> Dmitry, can you check whether this change makes a difference?

Also works with the blob test:

  ➜  ./pyvenv/bin/meson test --setup thorough func-aarch64-gpu_blob
  ninja: Entering directory `/home/alex/lsrc/qemu.git/builds/all'
  [1/6] Generating qemu-version.h with a custom command (wrapped by meson to capture output)
  1/1 qemu:func-thorough+func-aarch64-thorough+thorough / func-aarch64-gpu_blob        OK              0.37s   1 subtests passed

  Ok:                 1   
  Expected Fail:      0   
  Fail:               0   
  Unexpected Pass:    0   
  Skipped:            0   
  Timeout:            0   

  Full log written to /home/alex/lsrc/qemu.git/builds/all/meson-logs/testlog-thorough.txt

so a Tested-by: Alex Bennée <alex.bennee@linaro.org> from me.


>
> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> ---
> Akihiko Odaki (5):
>       futex: Add qemu_futex_timedwait()
>       qemu-thread: Add qemu_event_timedwait()
>       rcu: Use call_rcu() in synchronize_rcu()
>       rcu: Wake the RCU thread when draining
>       virtio-gpu: Force RCU when unmapping blob
>
>  include/qemu/futex.h          |  29 ++++++--
>  include/qemu/rcu.h            |   1 +
>  include/qemu/thread-posix.h   |  11 +++
>  include/qemu/thread.h         |   8 ++-
>  hw/display/virtio-gpu-virgl.c |   1 +
>  util/event.c                  |  34 ++++++++--
>  util/qemu-thread-posix.c      |  11 +--
>  util/rcu.c                    | 153 ++++++++++++++++++++++++------------------
>  8 files changed, 163 insertions(+), 85 deletions(-)
> ---
> base-commit: ee7fbe81705732785aef2cb568bbc5d8f7d2fce1
> change-id: 20251027-force_rcu-616c743373f7
>
> Best regards,

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Re: [PATCH 0/5] virtio-gpu: Force RCU when unmapping blob
Posted by Alex Bennée 1 week, 6 days ago
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> writes:

> Based-on: <20251016-force-v1-1-919a82112498@rsg.ci.i.u-tokyo.ac.jp>
> ("[PATCH] rcu: Unify force quiescent state")
>
> Unmapping a blob changes the memory map, which is protected with RCU.
> RCU is designed to minimize the read-side overhead at the cost of
> reclamation delay. While this design usually makes sense, it is
> problematic when unmapping a blob because the operation blocks all
> virtio-gpu commands and causes perceivable disruption.
>
> Minimize this disruption with force_rcu(), which minimizes the
> reclamation delay at the cost of read-side overhead.
>
> Dmitry, can you check whether this change makes a difference?
>
> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>

Are you planning a re-spin now that the RCU patch is merged? If the RCU
maintainers are happy, I'm fine to take it via virtio-gpu/next with the
testcase.

> ---
> Akihiko Odaki (5):
>       futex: Add qemu_futex_timedwait()
>       qemu-thread: Add qemu_event_timedwait()
>       rcu: Use call_rcu() in synchronize_rcu()
>       rcu: Wake the RCU thread when draining
>       virtio-gpu: Force RCU when unmapping blob
>
>  include/qemu/futex.h          |  29 ++++++--
>  include/qemu/rcu.h            |   1 +
>  include/qemu/thread-posix.h   |  11 +++
>  include/qemu/thread.h         |   8 ++-
>  hw/display/virtio-gpu-virgl.c |   1 +
>  util/event.c                  |  34 ++++++++--
>  util/qemu-thread-posix.c      |  11 +--
>  util/rcu.c                    | 153 ++++++++++++++++++++++++------------------
>  8 files changed, 163 insertions(+), 85 deletions(-)
> ---
> base-commit: ee7fbe81705732785aef2cb568bbc5d8f7d2fce1
> change-id: 20251027-force_rcu-616c743373f7
>
> Best regards,

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro