[PATCH v3 0/8] accel/tcg: Rewrite user-only vma tracking

Richard Henderson posted 8 patches 1 year, 4 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20221209051914.398215-1-richard.henderson@linaro.org
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, Riku Voipio <riku.voipio@iki.fi>, "Alex Bennée" <alex.bennee@linaro.org>
The primary motivation here is the numerous bug reports (e.g. #290)
about not being able to handle very large memory allocations.
I presume all or most of these are due to guest use of the clang
address sanitizer, which allocates a massive shadow vma.

This patch set copies the Linux kernel code for interval trees,
which is what the kernel itself uses for managing vmas.  I then
purge all (real) use of PageDesc from user-only.  This is easy
for user-only because everything tricky happens under mmap_lock().
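
To illustrate why an interval tree suits this job, here is a minimal
sketch of the core idea: a binary search tree keyed on interval start,
where each node also records the largest end point in its subtree so
that an overlap query can skip whole subtrees.  It is not the code in
util/interval-tree.c; the SketchNode type and the sketch_* functions are
hypothetical names invented for this example, and the rebalancing that
a real red-black implementation performs is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node covering the closed interval [start, last]. */
typedef struct SketchNode {
    uint64_t start, last;
    uint64_t subtree_last;          /* largest 'last' in this subtree */
    struct SketchNode *left, *right;
} SketchNode;

/* Insert n, ordered by start; no rebalancing in this sketch. */
void sketch_insert(SketchNode **root, SketchNode *n)
{
    n->left = n->right = NULL;
    n->subtree_last = n->last;
    while (*root) {
        SketchNode *p = *root;
        /* n ends up below p, so p's subtree maximum may grow. */
        if (p->subtree_last < n->last) {
            p->subtree_last = n->last;
        }
        root = n->start < p->start ? &p->left : &p->right;
    }
    *root = n;
}

/* Return some node overlapping [start, last], or NULL if none does. */
SketchNode *sketch_find(SketchNode *n, uint64_t start, uint64_t last)
{
    while (n && !(n->start <= last && start <= n->last)) {
        if (n->left && n->left->subtree_last >= start) {
            n = n->left;    /* an overlap, if any exists, is on the left */
        } else {
            n = n->right;   /* nothing on the left reaches 'start' */
        }
    }
    return n;
}
```

The cost of a lookup depends on the number of mappings rather than on
the size of the address range they cover, which is the property that
matters for a guest that creates one enormous vma.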

I have thought only briefly about using interval trees for system
mode too, but the locking situation there is more difficult.  So
for now that code gets moved around but not substantially changed.

The test case from #290 is added to tests/tcg/multiarch/.
Before this patch set, on my moderately beefy laptop, it runs for 39s
and reaches an RSS of 28GB before the qemu process is killed.  After
the patch set, the test case successfully allocates 16TB and
completes in 0.013s.


r~


Changes for v3:
  * Rename page_flush_tb to tb_remove_all (new patch 2).
  * Shuffle code in last patch, remove tb_lock for !sysemu for clang.

Changes for v2:
  * Rebase on master, 17 patches merged.
  * Structure of page_get_target_data adjusted (ajb).


Richard Henderson (8):
  util: Add interval-tree.c
  accel/tcg: Rename page_flush_tb
  accel/tcg: Use interval tree for TBs in user-only mode
  accel/tcg: Use interval tree for TARGET_PAGE_DATA_SIZE
  accel/tcg: Move page_{get,set}_flags to user-exec.c
  accel/tcg: Use interval tree for user-only page tracking
  accel/tcg: Move PageDesc tree into tb-maint.c for system
  accel/tcg: Move remainder of page locking to tb-maint.c

 accel/tcg/internal.h            |  85 +--
 include/exec/exec-all.h         |  43 +-
 include/exec/translate-all.h    |   6 -
 include/qemu/interval-tree.h    |  99 ++++
 accel/tcg/tb-maint.c            | 984 ++++++++++++++++++++++++--------
 accel/tcg/translate-all.c       | 746 ------------------------
 accel/tcg/user-exec.c           | 658 ++++++++++++++++++++-
 tests/tcg/multiarch/test-vma.c  |  22 +
 tests/unit/test-interval-tree.c | 209 +++++++
 util/interval-tree.c            | 882 ++++++++++++++++++++++++++++
 tests/unit/meson.build          |   1 +
 util/meson.build                |   1 +
 12 files changed, 2653 insertions(+), 1083 deletions(-)
 create mode 100644 include/qemu/interval-tree.h
 create mode 100644 tests/tcg/multiarch/test-vma.c
 create mode 100644 tests/unit/test-interval-tree.c
 create mode 100644 util/interval-tree.c

-- 
2.34.1