[Qemu-devel] [PATCH for-4.0 v2 00/37] tcg: Assorted cleanups

Richard Henderson posted 37 patches 5 years, 5 months ago
Test asan passed
Test checkpatch failed
Test docker-quick@centos7 passed
Test docker-mingw@fedora passed
Test docker-clang@ubuntu passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20181123144558.5048-1-richard.henderson@linaro.org
There is a newer version of this series
tcg/aarch64/tcg-target.h     |   3 +-
tcg/arm/tcg-target.h         |   7 +-
tcg/i386/tcg-target.h        |  19 +-
tcg/mips/tcg-target.h        |   1 +
tcg/ppc/tcg-target.h         |   3 +-
tcg/s390/tcg-target.h        |   1 +
tcg/sparc/tcg-target.h       |   1 +
tcg/tcg.h                    |   5 +
tcg/tci/tcg-target.h         |   2 +
accel/tcg/translate-all.c    |  15 +-
tcg/aarch64/tcg-target.inc.c | 369 ++++++++--------
tcg/arm/tcg-target.inc.c     | 643 +++++++++++++--------------
tcg/i386/tcg-target.inc.c    | 821 ++++++++++++++++++++---------------
tcg/mips/tcg-target.inc.c    |  29 +-
tcg/optimize.c               |  12 +
tcg/ppc/tcg-target.inc.c     | 562 +++++++++++++-----------
tcg/s390/tcg-target.inc.c    |  37 +-
tcg/sparc/tcg-target.inc.c   |  13 +-
tcg/tcg-ldst-ool.inc.c       |  95 ++++
tcg/tcg-op.c                 | 215 ++++++---
tcg/tcg-pool.inc.c           |   5 +-
tcg/tcg.c                    |  36 +-
tcg/tci/tcg-target.inc.c     |   3 +-
23 files changed, 1628 insertions(+), 1269 deletions(-)
create mode 100644 tcg/tcg-ldst-ool.inc.c
[Qemu-devel] [PATCH for-4.0 v2 00/37] tcg: Assorted cleanups
Posted by Richard Henderson 5 years, 5 months ago
This includes everything queued so far -- softmmu out-of-line
patches, bswap cleanups, and (new) eliminating all scratch
registers from x86 user-only memops.

This tree is now at

  https://github.com/rth7680/qemu.git tcg-next-for-4.0

for future tcg/riscv/ rebasing.


r~


Richard Henderson (37):
  tcg/i386: Always use %ebp for TCG_AREG0
  tcg/i386: Move TCG_REG_CALL_STACK from define to enum
  tcg: Return success from patch_reloc
  tcg: Add TCG_TARGET_NEED_LDST_OOL_LABELS
  tcg/i386: Add constraints for r8 and r9
  tcg/i386: Return a base register from tcg_out_tlb_load
  tcg/i386: Change TCG_REG_L[01] to not overlap function arguments
  tcg/i386: Force qemu_ld/st arguments into fixed registers
  tcg/i386: Use TCG_TARGET_NEED_LDST_OOL_LABELS
  tcg/aarch64: Add constraints for x0, x1, x2
  tcg/aarch64: Parameterize the temps for tcg_out_tlb_read
  tcg/aarch64: Parameterize the temp for tcg_out_goto_long
  tcg/aarch64: Use B not BL for tcg_out_goto_long
  tcg/aarch64: Use TCG_TARGET_NEED_LDST_OOL_LABELS
  tcg/arm: Parameterize the temps for tcg_out_tlb_read
  tcg/arm: Add constraints for R0-R5
  tcg/arm: Reduce the number of temps for tcg_out_tlb_read
  tcg/arm: Force qemu_ld/st arguments into fixed registers
  tcg/arm: Use TCG_TARGET_NEED_LDST_OOL_LABELS
  tcg/ppc: Parameterize the temps for tcg_out_tlb_read
  tcg/ppc: Split out tcg_out_call_int
  tcg/ppc: Add constraints for R7-R8
  tcg/ppc: Change TCG_TARGET_CALL_ALIGN_ARGS to bool
  tcg/ppc: Force qemu_ld/st arguments into fixed registers
  tcg/ppc: Use TCG_TARGET_NEED_LDST_OOL_LABELS
  tcg: Clean up generic bswap32
  tcg: Clean up generic bswap64
  tcg/optimize: Optimize bswap
  tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP
  tcg/i386: Adjust TCG_TARGET_HAS_MEMORY_BSWAP
  tcg/aarch64: Set TCG_TARGET_HAS_MEMORY_BSWAP to false
  tcg/arm: Set TCG_TARGET_HAS_MEMORY_BSWAP to false for user-only
  tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct
  tcg/i386: Restrict user-only qemu_st_i32 values to q-regs
  tcg/i386: Add setup_guest_base_seg for FreeBSD
  tcg/i386: Require segment syscalls to succeed
  tcg/i386: Remove L constraint

 tcg/aarch64/tcg-target.h     |   3 +-
 tcg/arm/tcg-target.h         |   7 +-
 tcg/i386/tcg-target.h        |  19 +-
 tcg/mips/tcg-target.h        |   1 +
 tcg/ppc/tcg-target.h         |   3 +-
 tcg/s390/tcg-target.h        |   1 +
 tcg/sparc/tcg-target.h       |   1 +
 tcg/tcg.h                    |   5 +
 tcg/tci/tcg-target.h         |   2 +
 accel/tcg/translate-all.c    |  15 +-
 tcg/aarch64/tcg-target.inc.c | 369 ++++++++--------
 tcg/arm/tcg-target.inc.c     | 643 +++++++++++++--------------
 tcg/i386/tcg-target.inc.c    | 821 ++++++++++++++++++++---------------
 tcg/mips/tcg-target.inc.c    |  29 +-
 tcg/optimize.c               |  12 +
 tcg/ppc/tcg-target.inc.c     | 562 +++++++++++++-----------
 tcg/s390/tcg-target.inc.c    |  37 +-
 tcg/sparc/tcg-target.inc.c   |  13 +-
 tcg/tcg-ldst-ool.inc.c       |  95 ++++
 tcg/tcg-op.c                 | 215 ++++++---
 tcg/tcg-pool.inc.c           |   5 +-
 tcg/tcg.c                    |  36 +-
 tcg/tci/tcg-target.inc.c     |   3 +-
 23 files changed, 1628 insertions(+), 1269 deletions(-)
 create mode 100644 tcg/tcg-ldst-ool.inc.c

-- 
2.17.2


Re: [Qemu-devel] [PATCH for-4.0 v2 00/37] tcg: Assorted cleanups
Posted by no-reply@patchew.org 5 years, 5 months ago
Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 20181123144558.5048-1-richard.henderson@linaro.org
Type: series
Subject: [Qemu-devel] [PATCH for-4.0 v2 00/37] tcg: Assorted cleanups

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
fc046e8 tcg/i386: Remove L constraint
25a6e0e tcg/i386: Require segment syscalls to succeed
5e0fd5d tcg/i386: Add setup_guest_base_seg for FreeBSD
d68fe20 tcg/i386: Restrict user-only qemu_st_i32 values to q-regs
85fcd2d tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct
a3178a9 tcg/arm: Set TCG_TARGET_HAS_MEMORY_BSWAP to false for user-only
4d0c5fc tcg/aarch64: Set TCG_TARGET_HAS_MEMORY_BSWAP to false
676c67e tcg/i386: Adjust TCG_TARGET_HAS_MEMORY_BSWAP
38c85b7 tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP
d3a1899 tcg/optimize: Optimize bswap
3ad8388 tcg: Clean up generic bswap64
affd2d8 tcg: Clean up generic bswap32
fd89430 tcg/ppc: Use TCG_TARGET_NEED_LDST_OOL_LABELS
8aa5784 tcg/ppc: Force qemu_ld/st arguments into fixed registers
af59cb6 tcg/ppc: Change TCG_TARGET_CALL_ALIGN_ARGS to bool
26bfc9a tcg/ppc: Add constraints for R7-R8
6cd8567 tcg/ppc: Split out tcg_out_call_int
d40e505 tcg/ppc: Parameterize the temps for tcg_out_tlb_read
acfcb89 tcg/arm: Use TCG_TARGET_NEED_LDST_OOL_LABELS
5516052 tcg/arm: Force qemu_ld/st arguments into fixed registers
46ad1f6 tcg/arm: Reduce the number of temps for tcg_out_tlb_read
24abc30 tcg/arm: Add constraints for R0-R5
94b24c4 tcg/arm: Parameterize the temps for tcg_out_tlb_read
91b0db1 tcg/aarch64: Use TCG_TARGET_NEED_LDST_OOL_LABELS
7129f8e tcg/aarch64: Use B not BL for tcg_out_goto_long
c770cee tcg/aarch64: Parameterize the temp for tcg_out_goto_long
412bf17 tcg/aarch64: Parameterize the temps for tcg_out_tlb_read
2ffa3ee tcg/aarch64: Add constraints for x0, x1, x2
a403293 tcg/i386: Use TCG_TARGET_NEED_LDST_OOL_LABELS
a65b8a9 tcg/i386: Force qemu_ld/st arguments into fixed registers
4053d4c tcg/i386: Change TCG_REG_L[01] to not overlap function arguments
c9e6cd5 tcg/i386: Return a base register from tcg_out_tlb_load
a601058 tcg/i386: Add constraints for r8 and r9
41701df tcg: Add TCG_TARGET_NEED_LDST_OOL_LABELS
fdb09e0 tcg: Return success from patch_reloc
8dab265 tcg/i386: Move TCG_REG_CALL_STACK from define to enum
58990b6 tcg/i386: Always use %ebp for TCG_AREG0

=== OUTPUT BEGIN ===
Checking PATCH 1/37: tcg/i386: Always use %ebp for TCG_AREG0...
Checking PATCH 2/37: tcg/i386: Move TCG_REG_CALL_STACK from define to enum...
Checking PATCH 3/37: tcg: Return success from patch_reloc...
Checking PATCH 4/37: tcg: Add TCG_TARGET_NEED_LDST_OOL_LABELS...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#47: 
new file mode 100644

total: 0 errors, 1 warnings, 192 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 5/37: tcg/i386: Add constraints for r8 and r9...
Checking PATCH 6/37: tcg/i386: Return a base register from tcg_out_tlb_load...
Checking PATCH 7/37: tcg/i386: Change TCG_REG_L[01] to not overlap function arguments...
Checking PATCH 8/37: tcg/i386: Force qemu_ld/st arguments into fixed registers...
Checking PATCH 9/37: tcg/i386: Use TCG_TARGET_NEED_LDST_OOL_LABELS...
Checking PATCH 10/37: tcg/aarch64: Add constraints for x0, x1, x2...
Checking PATCH 11/37: tcg/aarch64: Parameterize the temps for tcg_out_tlb_read...
Checking PATCH 12/37: tcg/aarch64: Parameterize the temp for tcg_out_goto_long...
Checking PATCH 13/37: tcg/aarch64: Use B not BL for tcg_out_goto_long...
Checking PATCH 14/37: tcg/aarch64: Use TCG_TARGET_NEED_LDST_OOL_LABELS...
Checking PATCH 15/37: tcg/arm: Parameterize the temps for tcg_out_tlb_read...
Checking PATCH 16/37: tcg/arm: Add constraints for R0-R5...
Checking PATCH 17/37: tcg/arm: Reduce the number of temps for tcg_out_tlb_read...
Checking PATCH 18/37: tcg/arm: Force qemu_ld/st arguments into fixed registers...
Checking PATCH 19/37: tcg/arm: Use TCG_TARGET_NEED_LDST_OOL_LABELS...
ERROR: externs should be avoided in .c files
#169: FILE: tcg/arm/tcg-target.inc.c:1483:
+    TCGReg addrlo __attribute__((unused));

ERROR: externs should be avoided in .c files
#170: FILE: tcg/arm/tcg-target.inc.c:1484:
+    TCGReg addrhi __attribute__((unused));

ERROR: externs should be avoided in .c files
#171: FILE: tcg/arm/tcg-target.inc.c:1485:
+    TCGReg datalo __attribute__((unused));

ERROR: externs should be avoided in .c files
#172: FILE: tcg/arm/tcg-target.inc.c:1486:
+    TCGReg datahi __attribute__((unused));

ERROR: externs should be avoided in .c files
#224: FILE: tcg/arm/tcg-target.inc.c:1602:
+    TCGReg addrlo __attribute__((unused));

ERROR: externs should be avoided in .c files
#225: FILE: tcg/arm/tcg-target.inc.c:1603:
+    TCGReg addrhi __attribute__((unused));

ERROR: externs should be avoided in .c files
#226: FILE: tcg/arm/tcg-target.inc.c:1604:
+    TCGReg datalo __attribute__((unused));

ERROR: externs should be avoided in .c files
#227: FILE: tcg/arm/tcg-target.inc.c:1605:
+    TCGReg datahi __attribute__((unused));

total: 8 errors, 0 warnings, 368 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 20/37: tcg/ppc: Parameterize the temps for tcg_out_tlb_read...
Checking PATCH 21/37: tcg/ppc: Split out tcg_out_call_int...
Checking PATCH 22/37: tcg/ppc: Add constraints for R7-R8...
Checking PATCH 23/37: tcg/ppc: Change TCG_TARGET_CALL_ALIGN_ARGS to bool...
Checking PATCH 24/37: tcg/ppc: Force qemu_ld/st arguments into fixed registers...
ERROR: do not initialise statics to 0 or NULL
#52: FILE: tcg/ppc/tcg-target.inc.c:1749:
+    static bool is_be = false;

total: 1 errors, 0 warnings, 207 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 25/37: tcg/ppc: Use TCG_TARGET_NEED_LDST_OOL_LABELS...
ERROR: externs should be avoided in .c files
#262: FILE: tcg/ppc/tcg-target.inc.c:1690:
+    TCGReg datalo __attribute__((unused));

ERROR: externs should be avoided in .c files
#263: FILE: tcg/ppc/tcg-target.inc.c:1691:
+    TCGReg datahi __attribute__((unused));

ERROR: externs should be avoided in .c files
#264: FILE: tcg/ppc/tcg-target.inc.c:1692:
+    TCGReg addrlo __attribute__((unused));

ERROR: externs should be avoided in .c files
#322: FILE: tcg/ppc/tcg-target.inc.c:1748:
+    TCGReg datalo __attribute__((unused));

ERROR: externs should be avoided in .c files
#323: FILE: tcg/ppc/tcg-target.inc.c:1749:
+    TCGReg datahi __attribute__((unused));

ERROR: externs should be avoided in .c files
#324: FILE: tcg/ppc/tcg-target.inc.c:1750:
+    TCGReg addrlo __attribute__((unused));

ERROR: externs should be avoided in .c files
#325: FILE: tcg/ppc/tcg-target.inc.c:1751:
+    TCGReg addrhi __attribute__((unused));

total: 7 errors, 0 warnings, 414 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 26/37: tcg: Clean up generic bswap32...
Checking PATCH 27/37: tcg: Clean up generic bswap64...
Checking PATCH 28/37: tcg/optimize: Optimize bswap...
ERROR: spaces required around that ':' (ctx:VxE)
#21: FILE: tcg/optimize.c:356:
+    CASE_OP_32_64(bswap16):
                           ^

ERROR: spaces required around that ':' (ctx:VxE)
#24: FILE: tcg/optimize.c:359:
+    CASE_OP_32_64(bswap32):
                           ^

ERROR: spaces required around that ':' (ctx:VxE)
#37: FILE: tcg/optimize.c:1117:
+        CASE_OP_32_64(bswap16):
                               ^

ERROR: spaces required around that ':' (ctx:VxE)
#38: FILE: tcg/optimize.c:1118:
+        CASE_OP_32_64(bswap32):
                               ^

total: 4 errors, 0 warnings, 24 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 29/37: tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP...
Checking PATCH 30/37: tcg/i386: Adjust TCG_TARGET_HAS_MEMORY_BSWAP...
Checking PATCH 31/37: tcg/aarch64: Set TCG_TARGET_HAS_MEMORY_BSWAP to false...
Checking PATCH 32/37: tcg/arm: Set TCG_TARGET_HAS_MEMORY_BSWAP to false for user-only...
ERROR: space prohibited after that '&' (ctx:WxW)
#206: FILE: tcg/arm/tcg-target.inc.c:1525:
+            && (datalo & 1) == 0 && datahi == datalo + 1) {
                        ^

total: 1 errors, 0 warnings, 289 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 33/37: tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct...
Checking PATCH 34/37: tcg/i386: Restrict user-only qemu_st_i32 values to q-regs...
Checking PATCH 35/37: tcg/i386: Add setup_guest_base_seg for FreeBSD...
ERROR: space prohibited between function name and open parenthesis '('
#17: FILE: tcg/i386/tcg-target.inc.c:1821:
+#elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)

ERROR: space prohibited between function name and open parenthesis '('
#17: FILE: tcg/i386/tcg-target.inc.c:1821:
+#elif defined (__FreeBSD__) || defined (__FreeBSD_kernel__)

total: 2 errors, 0 warnings, 16 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 36/37: tcg/i386: Require segment syscalls to succeed...
Checking PATCH 37/37: tcg/i386: Remove L constraint...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [Qemu-devel] [PATCH for-4.0 v2 00/37] tcg: Assorted cleanups
Posted by Emilio G. Cota 5 years, 4 months ago
On Fri, Nov 23, 2018 at 15:45:21 +0100, Richard Henderson wrote:
> This includes everything queued so far -- softmmu out-of-line
> patches

Reviewed-by: Emilio G. Cota <cota@braap.org>
for patches 1-9.

I am sad to report that on a Skylake host, this series gives
a ~10% average slowdown for x86_64-softmmu SPEC06int
(I'm reporting speedup, so <1 means slowdown):
  https://imgur.com/a/25iu8Yl

Turns out that despite the higher icache hit, the IPC
ends up being lower. For instance, here are perf counts when
running hmmer x3 right after booting up (bootup is included
in the counts, but hmmer is run 3 times in a row):

- Before:
   249,392,070,159      cycles
   781,327,593,681      instructions              #    3.13  insn per cycle
    85,914,418,873      branches
       242,572,820      branch-misses             #    0.28% of all branches
     1,567,954,032      L1-icache-load-misses

      70.559864567 seconds time elapsed

- After:
   277,806,651,701      cycles
   813,619,725,225      instructions              #    2.93  insn per cycle
   132,453,633,831      branches
       306,969,989      branch-misses             #    0.23% of all branches
     1,250,619,057      L1-icache-load-misses

      78.420517079 seconds time elapsed

On the bright side, in an older system (Sandy Bridge), I get
a fairly neutral average perf impact, with some workloads
speeding up and others slowing down:
  https://imgur.com/a/AokDbkm
(Note that v1 of this series gave an overall slowdown, so that's
progress.)

Given the above, perhaps the best way forward is to add a
configure flag to disable OOL thunks, unless you have any
further optimizations coming up.

Thanks,

		Emilio