[PATCH v9 00/74] per-CPU locks

Robert Foley posted 74 patches 1 week ago
Test docker-mingw@fedora passed
Test checkpatch passed
Test asan passed
Test docker-quick@centos7 passed
Test FreeBSD passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20200521164011.638-1-robert.foley@linaro.org
Maintainers: Richard Henderson <rth@twiddle.net>, Cornelia Huck <cohuck@redhat.com>, Laurent Vivier <laurent@vivier.eu>, Sunil Muthuswamy <sunilmut@microsoft.com>, Eduardo Habkost <ehabkost@redhat.com>, Bastian Koppelmann <kbastian@mail.uni-paderborn.de>, Max Filippov <jcmvbkbc@gmail.com>, Aleksandar Markovic <aleksandar.qemu.devel@gmail.com>, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>, David Gibson <david@gibson.dropbear.id.au>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, Artyom Tarasenko <atar4qemu@gmail.com>, Palmer Dabbelt <palmer@dabbelt.com>, Alistair Francis <Alistair.Francis@wdc.com>, David Hildenbrand <david@redhat.com>, Sagar Karandikar <sagark@eecs.berkeley.edu>, Aleksandar Rikalo <aleksandar.rikalo@rt-rk.com>, Roman Bolshakov <r.bolshakov@yadro.com>, Aurelien Jarno <aurelien@aurel32.net>

v8: https://lists.gnu.org/archive/html/qemu-devel/2020-03/msg08031.html

Quoting an earlier patch in the series:
"For context, the goal of this series is to substitute the BQL for the
per-CPU locks in many places, notably the execution loop in cpus.c.
This leads to better scalability for MTTCG, since CPUs don't have
to acquire a contended global lock (the BQL) every time they
stop executing code.
See the last commit for some performance numbers."
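
To make the quoted goal concrete, here is a minimal sketch of the idea
using plain pthreads. It is not code from the series: DemoCPU and the
demo_* helpers are hypothetical names, and the real helpers may differ
in shape. The point is only that state belonging to one CPU is guarded
by that CPU's own mutex, so touching CPU A never contends with CPU B.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; this is not QEMU's CPUState. */
typedef struct DemoCPU {
    pthread_mutex_t lock;        /* per-CPU lock guarding the fields below */
    bool halted;
    unsigned interrupt_request;
} DemoCPU;

void demo_cpu_init(DemoCPU *cpu)
{
    assert(cpu != NULL);
    pthread_mutex_init(&cpu->lock, NULL);
    cpu->halted = false;
    cpu->interrupt_request = 0;
}

/* Read the halted flag under this CPU's lock only. */
bool demo_cpu_halted(DemoCPU *cpu)
{
    pthread_mutex_lock(&cpu->lock);
    bool ret = cpu->halted;
    pthread_mutex_unlock(&cpu->lock);
    return ret;
}

/* Update the halted flag under this CPU's lock only. */
void demo_cpu_halted_set(DemoCPU *cpu, bool val)
{
    pthread_mutex_lock(&cpu->lock);
    cpu->halted = val;
    pthread_mutex_unlock(&cpu->lock);
}

/* Raise interrupt bits on one CPU; no global lock is involved. */
void demo_cpu_interrupt(DemoCPU *cpu, unsigned mask)
{
    pthread_mutex_lock(&cpu->lock);
    cpu->interrupt_request |= mask;
    pthread_mutex_unlock(&cpu->lock);
}

/* Read the pending interrupt bits under this CPU's lock only. */
unsigned demo_cpu_interrupt_request(DemoCPU *cpu)
{
    pthread_mutex_lock(&cpu->lock);
    unsigned ret = cpu->interrupt_request;
    pthread_mutex_unlock(&cpu->lock);
    return ret;
}
```

With a single global lock, every one of these accessors would serialize
all CPUs. With one lock per CPU, threads driving different CPUs only
contend when they touch the same CPU.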

Listed below are the changes for this version of the series,
aside from the merge-related changes.

Changes for v9:
- Fixed merge issue in cpu_common_finalize().
- Relocated CPU_LOCK_BITMAP_SIZE from cpus.c to hw/core/cpu.h and
  added an assert on it in smp_parse() in hw/core/machine.c.
- Modified stubs/cpu-lock.c to have an empty implementation of
  cpu_mutex_lock_impl/unlock_impl.
- Cleaned up many commits to remove CCs where the review or ack
  was already added.
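
As a rough illustration of why a bitmap size would be asserted against
the CPU count, the sketch below assumes the bitmap has one bit per CPU
index and records which CPU locks the current thread holds. The cover
letter does not spell that out, so treat it as an assumption. All
DEMO_*/demo_* names and the value 2048 are hypothetical and are not
taken from the series; the two stub functions only show what an "empty
implementation" of a lock/unlock pair looks like.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed upper bound on CPU indexes; the value is illustrative only. */
#define DEMO_LOCK_BITMAP_SIZE 2048
#define DEMO_BITS_PER_WORD    (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITMAP_WORDS \
    ((DEMO_LOCK_BITMAP_SIZE + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

/* One bit per CPU index, kept per thread (assumption, see lead-in). */
static _Thread_local unsigned long demo_lock_bitmap[DEMO_BITMAP_WORDS];

/* The kind of check an smp_parse()-style function could assert on. */
bool demo_max_cpus_ok(unsigned max_cpus)
{
    return max_cpus <= DEMO_LOCK_BITMAP_SIZE;
}

/* Record that the current thread now holds the lock of cpu_index. */
void demo_mark_locked(unsigned cpu_index)
{
    assert(cpu_index < DEMO_LOCK_BITMAP_SIZE);
    demo_lock_bitmap[cpu_index / DEMO_BITS_PER_WORD] |=
        1UL << (cpu_index % DEMO_BITS_PER_WORD);
}

/* Record that the current thread released the lock of cpu_index. */
void demo_mark_unlocked(unsigned cpu_index)
{
    assert(cpu_index < DEMO_LOCK_BITMAP_SIZE);
    demo_lock_bitmap[cpu_index / DEMO_BITS_PER_WORD] &=
        ~(1UL << (cpu_index % DEMO_BITS_PER_WORD));
}

/* Does the current thread hold the lock of cpu_index? */
bool demo_is_locked(unsigned cpu_index)
{
    assert(cpu_index < DEMO_LOCK_BITMAP_SIZE);
    return demo_lock_bitmap[cpu_index / DEMO_BITS_PER_WORD] &
           (1UL << (cpu_index % DEMO_BITS_PER_WORD));
}

/* Empty stubs: satisfy the linker where no real locking is needed. */
void demo_cpu_mutex_lock_stub(void *cpu)   { (void)cpu; }
void demo_cpu_mutex_unlock_stub(void *cpu) { (void)cpu; }
```

A CPU index at or above the bitmap size would index past the array,
which is why checking the configured CPU count once, up front, is
cheaper than discovering the overflow later.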

Emilio G. Cota (69):
  cpu: convert queued work to a QSIMPLEQ
  cpu: rename cpu->work_mutex to cpu->lock
  cpu: introduce cpu_mutex_lock/unlock
  cpu: make qemu_work_cond per-cpu
  cpu: move run_on_cpu to cpus-common
  cpu: introduce process_queued_cpu_work_locked
  cpu: make per-CPU locks an alias of the BQL in TCG rr mode
  tcg-runtime: define helper_cpu_halted_set
  ppc: convert to helper_cpu_halted_set
  cris: convert to helper_cpu_halted_set
  hppa: convert to helper_cpu_halted_set
  m68k: convert to helper_cpu_halted_set
  alpha: convert to helper_cpu_halted_set
  microblaze: convert to helper_cpu_halted_set
  cpu: define cpu_halted helpers
  tcg-runtime: convert to cpu_halted_set
  arm: convert to cpu_halted
  ppc: convert to cpu_halted
  sh4: convert to cpu_halted
  i386: convert to cpu_halted
  lm32: convert to cpu_halted
  m68k: convert to cpu_halted
  mips: convert to cpu_halted
  riscv: convert to cpu_halted
  s390x: convert to cpu_halted
  sparc: convert to cpu_halted
  xtensa: convert to cpu_halted
  gdbstub: convert to cpu_halted
  openrisc: convert to cpu_halted
  cpu-exec: convert to cpu_halted
  cpu: convert to cpu_halted
  cpu: define cpu_interrupt_request helpers
  exec: use cpu_reset_interrupt
  arm: convert to cpu_interrupt_request
  i386: convert to cpu_interrupt_request
  i386/kvm: convert to cpu_interrupt_request
  i386/hax-all: convert to cpu_interrupt_request
  i386/whpx-all: convert to cpu_interrupt_request
  i386/hvf: convert to cpu_request_interrupt
  ppc: convert to cpu_interrupt_request
  sh4: convert to cpu_interrupt_request
  cris: convert to cpu_interrupt_request
  hppa: convert to cpu_interrupt_request
  lm32: convert to cpu_interrupt_request
  m68k: convert to cpu_interrupt_request
  mips: convert to cpu_interrupt_request
  nios: convert to cpu_interrupt_request
  s390x: convert to cpu_interrupt_request
  alpha: convert to cpu_interrupt_request
  moxie: convert to cpu_interrupt_request
  sparc: convert to cpu_interrupt_request
  openrisc: convert to cpu_interrupt_request
  unicore32: convert to cpu_interrupt_request
  microblaze: convert to cpu_interrupt_request
  accel/tcg: convert to cpu_interrupt_request
  cpu: convert to interrupt_request
  cpu: call .cpu_has_work with the CPU lock held
  cpu: introduce cpu_has_work_with_iothread_lock
  ppc: convert to cpu_has_work_with_iothread_lock
  mips: convert to cpu_has_work_with_iothread_lock
  s390x: convert to cpu_has_work_with_iothread_lock
  riscv: convert to cpu_has_work_with_iothread_lock
  sparc: convert to cpu_has_work_with_iothread_lock
  xtensa: convert to cpu_has_work_with_iothread_lock
  cpu: rename all_cpu_threads_idle to qemu_tcg_rr_all_cpu_threads_idle
  cpu: protect CPU state with cpu->lock instead of the BQL
  cpus-common: release BQL earlier in run_on_cpu
  cpu: add async_run_on_cpu_no_bql
  cputlb: queue async flush jobs without the BQL

Paolo Bonzini (4):
  ppc: use cpu_reset_interrupt
  i386: use cpu_reset_interrupt
  s390x: use cpu_reset_interrupt
  openrisc: use cpu_reset_interrupt

Robert Foley (1):
  hw/semihosting: convert to cpu_halted_set

 accel/tcg/cpu-exec.c            |  40 ++-
 accel/tcg/cputlb.c              |  10 +-
 accel/tcg/tcg-all.c             |  12 +-
 accel/tcg/tcg-runtime.c         |   7 +
 accel/tcg/tcg-runtime.h         |   2 +
 accel/tcg/translate-all.c       |   2 +-
 cpus-common.c                   | 129 +++++++---
 cpus.c                          | 435 ++++++++++++++++++++++++++------
 exec.c                          |   2 +-
 gdbstub.c                       |   4 +-
 hw/arm/omap1.c                  |   4 +-
 hw/arm/pxa2xx_gpio.c            |   2 +-
 hw/arm/pxa2xx_pic.c             |   2 +-
 hw/core/cpu.c                   |  29 +--
 hw/core/machine-qmp-cmds.c      |   2 +-
 hw/core/machine.c               |   1 +
 hw/intc/s390_flic.c             |   4 +-
 hw/mips/cps.c                   |   2 +-
 hw/misc/mips_itu.c              |   4 +-
 hw/openrisc/cputimer.c          |   2 +-
 hw/ppc/e500.c                   |   4 +-
 hw/ppc/ppc.c                    |  12 +-
 hw/ppc/ppce500_spin.c           |   6 +-
 hw/ppc/spapr_cpu_core.c         |   4 +-
 hw/ppc/spapr_hcall.c            |  14 +-
 hw/ppc/spapr_rtas.c             |   8 +-
 hw/semihosting/console.c        |   4 +-
 hw/sparc/leon3.c                |   2 +-
 hw/sparc/sun4m.c                |   8 +-
 hw/sparc64/sparc64.c            |   8 +-
 include/hw/core/cpu.h           | 200 +++++++++++++--
 stubs/Makefile.objs             |   1 +
 stubs/cpu-lock.c                |  27 ++
 target/alpha/cpu.c              |   8 +-
 target/alpha/translate.c        |   6 +-
 target/arm/arm-powerctl.c       |   6 +-
 target/arm/cpu.c                |   8 +-
 target/arm/helper.c             |  16 +-
 target/arm/machine.c            |   2 +-
 target/arm/op_helper.c          |   2 +-
 target/cris/cpu.c               |   2 +-
 target/cris/helper.c            |   4 +-
 target/cris/translate.c         |   5 +-
 target/hppa/cpu.c               |   2 +-
 target/hppa/translate.c         |   3 +-
 target/i386/cpu.c               |   4 +-
 target/i386/cpu.h               |   2 +-
 target/i386/hax-all.c           |  42 +--
 target/i386/helper.c            |   8 +-
 target/i386/hvf/hvf.c           |  12 +-
 target/i386/hvf/x86hvf.c        |  37 +--
 target/i386/kvm.c               |  82 +++---
 target/i386/misc_helper.c       |   2 +-
 target/i386/seg_helper.c        |  13 +-
 target/i386/svm_helper.c        |   6 +-
 target/i386/whpx-all.c          |  57 +++--
 target/lm32/cpu.c               |   2 +-
 target/lm32/op_helper.c         |   4 +-
 target/m68k/cpu.c               |   2 +-
 target/m68k/op_helper.c         |   2 +-
 target/m68k/translate.c         |   9 +-
 target/microblaze/cpu.c         |   2 +-
 target/microblaze/translate.c   |   4 +-
 target/mips/cp0_helper.c        |   6 +-
 target/mips/cpu.c               |  11 +-
 target/mips/kvm.c               |   4 +-
 target/mips/op_helper.c         |   2 +-
 target/mips/translate.c         |   4 +-
 target/moxie/cpu.c              |   2 +-
 target/nios2/cpu.c              |   2 +-
 target/openrisc/cpu.c           |   4 +-
 target/openrisc/sys_helper.c    |   4 +-
 target/ppc/excp_helper.c        |   6 +-
 target/ppc/helper_regs.h        |   2 +-
 target/ppc/kvm.c                |   6 +-
 target/ppc/translate.c          |   6 +-
 target/ppc/translate_init.inc.c |  41 +--
 target/riscv/cpu.c              |   5 +-
 target/riscv/op_helper.c        |   2 +-
 target/s390x/cpu.c              |  28 +-
 target/s390x/excp_helper.c      |   4 +-
 target/s390x/kvm.c              |   2 +-
 target/s390x/sigp.c             |   8 +-
 target/sh4/cpu.c                |   2 +-
 target/sh4/helper.c             |   2 +-
 target/sh4/op_helper.c          |   2 +-
 target/sparc/cpu.c              |   6 +-
 target/sparc/helper.c           |   2 +-
 target/unicore32/cpu.c          |   2 +-
 target/unicore32/softmmu.c      |   2 +-
 target/xtensa/cpu.c             |   6 +-
 target/xtensa/exc_helper.c      |   2 +-
 target/xtensa/helper.c          |   2 +-
 93 files changed, 1060 insertions(+), 464 deletions(-)
 create mode 100644 stubs/cpu-lock.c

-- 
2.17.1