accel/tcg/Makefile.objs | 1 + accel/tcg/cputlb.c | 63 +---- accel/tcg/cputlb.h | 21 ++ accel/tcg/softmmu.c | 452 +++++++++++++++++++++++++++++++++++ accel/tcg/softmmu_template.h | 446 ---------------------------------- 5 files changed, 485 insertions(+), 498 deletions(-) create mode 100644 accel/tcg/cputlb.h create mode 100644 accel/tcg/softmmu.c delete mode 100644 accel/tcg/softmmu_template.h
Hi, This is a re-based re-spin of my RFC series with a few minor updates thanks to updates to the softmmu code. The series is attempting to achieve the same results we did with softfloat by replacing our nest of macro expansions with some common helpers. By using __flatten__ to force the whole function to be un-rolled the compiler can then dead-code away parts of the common function not used in each particular instantiated helper. This is a win from a debugging point of view - no more "where is helper_le_lduw_mmu defined" questions. I think it will also make eventual instrumentation easier because there is exactly one slow path load and store function to instrument. Although the individual functions are now bigger (__flatten__ stops the compiler ignoring inline if it wants to) the size of the resulting binary is slightly smaller! original - 73027224 bytes demacro - 66913848 bytes Unfortunately in my simple boot test I see a slight performance degradation: original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation) demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation) Emilio, Any chance you could run this through your more comprehensive benchmark suite? Alex Bennée (4): accel/tcg: export some cputlb functions accel/tcg: introduce softmmu.c accel/tcg: use TLB helpers from softmmu.o accel/tcg: remove softmmu_template.h accel/tcg/Makefile.objs | 1 + accel/tcg/cputlb.c | 63 +---- accel/tcg/cputlb.h | 21 ++ accel/tcg/softmmu.c | 452 +++++++++++++++++++++++++++++++++++ accel/tcg/softmmu_template.h | 446 ---------------------------------- 5 files changed, 485 insertions(+), 498 deletions(-) create mode 100644 accel/tcg/cputlb.h create mode 100644 accel/tcg/softmmu.c delete mode 100644 accel/tcg/softmmu_template.h -- 2.17.1
Alex Bennée <alex.bennee@linaro.org> writes: > Hi, > <snip> > > Unfortunately in my simple boot test I see a slight performance > degradation: > > original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation) > demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation) Moving stuff back into cputlb seems to help: 10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation) > > Emilio, > > Any chance you could run this through your more comprehensive benchmark > suite? > > Alex Bennée (4): > accel/tcg: export some cputlb functions > accel/tcg: introduce softmmu.c > accel/tcg: use TLB helpers from softmmu.o > accel/tcg: remove softmmu_template.h > > accel/tcg/Makefile.objs | 1 + > accel/tcg/cputlb.c | 63 +---- > accel/tcg/cputlb.h | 21 ++ > accel/tcg/softmmu.c | 452 +++++++++++++++++++++++++++++++++++ > accel/tcg/softmmu_template.h | 446 ---------------------------------- > 5 files changed, 485 insertions(+), 498 deletions(-) > create mode 100644 accel/tcg/cputlb.h > create mode 100644 accel/tcg/softmmu.c > delete mode 100644 accel/tcg/softmmu_template.h -- Alex Bennée
Alex Bennée <alex.bennee@linaro.org> writes: > Alex Bennée <alex.bennee@linaro.org> writes: > >> Hi, >> > <snip> >> >> Unfortunately in my simple boot test I see a slight performance >> degradation: >> >> original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation) >> demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation) > > Moving stuff back into cputlb seems to help: > > 10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation) See: https://github.com/stsquad/qemu/tree/ldst/demacrofy-v2 Which: - keeps everything in cputlb (dropping the externs) - factors out unaligned handling - uses __always_inline__ instead of __flatten__ > >> >> Emilio, >> >> Any chance you could run this through your more comprehensive benchmark >> suite? >> >> Alex Bennée (4): >> accel/tcg: export some cputlb functions >> accel/tcg: introduce softmmu.c >> accel/tcg: use TLB helpers from softmmu.o >> accel/tcg: remove softmmu_template.h >> >> accel/tcg/Makefile.objs | 1 + >> accel/tcg/cputlb.c | 63 +---- >> accel/tcg/cputlb.h | 21 ++ >> accel/tcg/softmmu.c | 452 +++++++++++++++++++++++++++++++++++ >> accel/tcg/softmmu_template.h | 446 ---------------------------------- >> 5 files changed, 485 insertions(+), 498 deletions(-) >> create mode 100644 accel/tcg/cputlb.h >> create mode 100644 accel/tcg/softmmu.c >> delete mode 100644 accel/tcg/softmmu_template.h -- Alex Bennée
On Mon, Dec 17, 2018 at 16:15:10 +0000, Alex Bennée wrote: > > Alex Bennée <alex.bennee@linaro.org> writes: > > > Hi, > > > <snip> > > > > Unfortunately in my simple boot test I see a slight performance > > degradation: > > > > original: 10 times (100.00%), avg time 5.358 (0.02 varience/0.13 deviation) > > demacro: 10 times (100.00%), avg time 5.760 (0.08 varience/0.29 deviation) > > Moving stuff back into cputlb seems to help: > > 10 times (100.00%), avg time 5.583 (0.03 varience/0.17 deviation) Yes, I'd move it there. Also playing with attr __noinline__ for the slow paths like we did for hardfloat should help. > > Emilio, > > > > Any chance you could run this through your more comprehensive benchmark > > suite? Sure, please give me a few days (got a paper submission deadline on Wed). Do you have a branch I can pull from? Thanks, Emilio
Patchew URL: https://patchew.org/QEMU/20181217150116.10446-1-alex.bennee@linaro.org/ Hi, This series failed the docker-mingw@fedora build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash time make docker-test-mingw@fedora SHOW_ENV=1 J=8 === TEST SCRIPT END === CC x86_64-softmmu/target/i386/cpu.o CC x86_64-softmmu/target/i386/gdbstub.o CC aarch64-softmmu/hw/block/virtio-blk.o /tmp/qemu-test/src/accel/tcg/softmmu.c:207:1: error: conflicting types for 'helper_le_ldq_mmu' helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi, ^~~~~~~~~~~~~~~~~ In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0, --- /tmp/qemu-test/src/tcg/tcg.h:1307:10: note: previous declaration of 'helper_le_ldq_mmu' was here uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, ^~~~~~~~~~~~~~~~~ /tmp/qemu-test/src/accel/tcg/softmmu.c:214:1: error: conflicting types for 'helper_be_ldq_mmu' helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi, ^~~~~~~~~~~~~~~~~ In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0, --- CC aarch64-softmmu/hw/misc/omap_tap.o CC aarch64-softmmu/hw/misc/bcm2835_mbox.o CC aarch64-softmmu/hw/misc/bcm2835_property.o /tmp/qemu-test/src/accel/tcg/softmmu.c:207:1: error: conflicting types for 'helper_le_ldq_mmu' helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi, ^~~~~~~~~~~~~~~~~ In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0, --- /tmp/qemu-test/src/tcg/tcg.h:1307:10: note: previous declaration of 'helper_le_ldq_mmu' was here uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, ^~~~~~~~~~~~~~~~~ /tmp/qemu-test/src/accel/tcg/softmmu.c:214:1: error: conflicting types for 'helper_be_ldq_mmu' helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, TCGMemOpIdx oi, ^~~~~~~~~~~~~~~~~ In file included from /tmp/qemu-test/src/include/exec/cpu_ldst.h:127:0, The full log is available at http://patchew.org/logs/20181217150116.10446-1-alex.bennee@linaro.org/testing.docker-mingw@fedora/?type=message. --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-devel@redhat.com
© 2016 - 2024 Red Hat, Inc.